AI GLOSSARY
Deception
Safety, Alignment & Ethics
Behavior by an AI system that creates false beliefs in its users or operators, whether by producing factually incorrect outputs, concealing its true capabilities, or strategically misrepresenting its reasoning. Deception is a significant safety and alignment concern: a system that deceives its overseers undermines the human oversight needed to detect and correct misalignment.
See also: deceptive alignment, AI alignment, transparency.