AI GLOSSARY

AI Alignment

Safety, Alignment & Ethics

The challenge of ensuring that an AI system's goals, values, and behaviors are consistent with human intentions and values, so that it does what we actually want, not just what we literally specified. Alignment is widely considered one of the most important and difficult problems in AI safety because, as systems become more capable, the consequences of misalignment become more severe and harder to correct.
See also: AI safety, reward hacking, interpretability.
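
Example: the gap between "what we literally specified" and "what we actually want" can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the behavior names, the scores, and the helper `best_by` are hypothetical and do not describe any real system or training method. It only shows that optimizing a literal specification (a proxy) can select a different behavior than the one the designer intended.

```python
# Toy illustration only: behaviors and scores are invented for this example.
# "proxy" is the score under the literal specification (e.g. "no visible mess").
# "intended" is the score under what the designer actually wanted (a clean room).
behaviors = {
    "clean the room": {"proxy": 8, "intended": 8},
    "hide the mess under the rug": {"proxy": 10, "intended": 1},
    "do nothing": {"proxy": 0, "intended": 0},
}


def best_by(metric):
    """Return the behavior with the highest score on the given metric."""
    return max(behaviors, key=lambda name: behaviors[name][metric])


chosen = best_by("proxy")      # what an optimizer of the literal spec picks
wanted = best_by("intended")   # what the designer was hoping for
misaligned = chosen != wanted  # True when the two objectives disagree

print("optimizer picks: ", chosen)
print("designer wanted: ", wanted)
print("misaligned:      ", misaligned)
```

In this sketch the proxy is maximized by a behavior that scores poorly on the intended objective, which is one simple picture of misalignment.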
