AI GLOSSARY
Benchmark
AI & Machine Learning
A standardized test or dataset used to measure and compare the performance of AI models. Benchmarks give researchers and practitioners a common language for evaluating progress, though a model that scores well on benchmarks does not always perform equally well in real-world use.
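As a rough illustration of the idea, the sketch below scores two toy models against the same fixed set of examples. Everything in it is hypothetical and chosen only for illustration: the five-item BENCHMARK list, the accuracy metric, and the two stand-in models. Real benchmarks are far larger and often use other metrics.

```python
from typing import Callable, List, Tuple

# Hypothetical toy benchmark: a fixed list of (input, expected label) pairs.
BENCHMARK: List[Tuple[int, str]] = [
    (2, "even"),
    (3, "odd"),
    (10, "even"),
    (7, "odd"),
    (0, "even"),
]


def accuracy(model: Callable[[int], str], benchmark: List[Tuple[int, str]]) -> float:
    """Return the fraction of benchmark items the model labels correctly."""
    correct = sum(1 for x, expected in benchmark if model(x) == expected)
    return correct / len(benchmark)


def always_even(x: int) -> str:
    """Stand-in for a weak model: ignores its input."""
    return "even"


def parity_model(x: int) -> str:
    """Stand-in for a stronger model: checks the parity of its input."""
    return "even" if x % 2 == 0 else "odd"


# Because both models see the same examples and the same metric,
# their scores are directly comparable.
scores = {
    "always_even": accuracy(always_even, BENCHMARK),
    "parity_model": accuracy(parity_model, BENCHMARK),
}

if __name__ == "__main__":
    for name, score in scores.items():
        print(f"{name}: {score:.2f}")
```

The shared dataset and metric are what make the two scores comparable. A high score here still says nothing about inputs the benchmark does not cover, which is the gap between benchmark and real-world performance noted above.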
See also: Evaluation, Baseline, BLEU score.