Back to glossary
AI GLOSSARY
Safety Evaluation
Evaluation & Performance
A systematic assessment of an AI system's behavior across scenarios involving potential harms (including bias, misinformation, dangerous content generation, and susceptibility to misuse). Safety evaluation goes beyond standard performance metrics to ask whether a model behaves responsibly across the full range of ways it might actually be used, including adversarial inputs and edge cases that normal benchmarks don't capture.
See also: red-teaming, model card.
