AI GLOSSARY

AI Safety

Safety, Alignment & Ethics

The broad research and engineering field concerned with ensuring that AI systems behave reliably, predictably, and in accordance with human values, both now and as systems become more capable. AI safety encompasses technical work on AI alignment, robustness, and interpretability, as well as governance and policy work on how to manage the risks that advanced AI systems might pose.
See also: AI alignment, AI containment, interpretability.
