Tiny QA Benchmark++: a micro-benchmark suite (52-item gold set plus on-demand multilingual synthetic packs), generator CLI, and CI-ready eval harness for ultra-fast LLM smoke testing and regression catching.
benchmark evaluation dataset smoke-test synthetic-data qa-dataset huggingface-datasets llm llmops litellm llm-testing tinybenchmarks
Updated May 20, 2025 - Python