AI GLOSSARY
AI Safety
Safety, Alignment & Ethics
The broad research and engineering field concerned with ensuring that AI systems behave reliably, predictably, and in accordance with human values, both today and as systems become more capable. AI safety encompasses technical work on AI alignment, robustness, and interpretability, as well as governance and policy work on managing the risks that advanced AI systems might pose.
See also: AI alignment, AI containment, interpretability.