AI GLOSSARY

Constitutional AI

Safety, Alignment & Ethics

An approach to AI alignment developed by Anthropic in which a model is trained to evaluate and revise its own outputs against a set of stated principles (a "constitution") rather than relying solely on human feedback for every judgment. Constitutional AI is meant to scale oversight by giving the model explicit values to reason against, reducing dependence on large volumes of human-labeled preference data.
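
The evaluate-and-revise idea can be illustrated with a minimal sketch. This is not Anthropic's implementation: the principles, the prompt wording, the `critique_and_revise` helper, and the `toy_model` stand-in are all hypothetical, chosen only so the loop runs without any external API.

```python
from typing import Callable, List

# Hypothetical principles for illustration; a real constitution is longer
# and far more carefully worded.
CONSTITUTION: List[str] = [
    "Avoid insulting language.",
    "Do not state guesses as facts.",
]


def critique_and_revise(
    prompt: str, model: Callable[[str], str], principles: List[str]
) -> str:
    """Draft a response, then critique and revise it once per principle."""
    response = model(f"Respond to: {prompt}")
    for principle in principles:
        # Ask the model to judge its own output against one principle.
        critique = model(
            f"Critique this response against the principle '{principle}':\n"
            f"{response}"
        )
        # Ask the model to rewrite the output in light of that critique.
        response = model(
            "Revise the response to address the critique.\n"
            f"Response: {response}\nCritique: {critique}"
        )
    return response


def toy_model(text: str) -> str:
    """Stand-in for a language model (assumption: any str -> str callable)."""
    if text.startswith("Respond to:"):
        return "That is a stupid question."
    if text.startswith("Critique"):
        return "The word 'stupid' is insulting."
    # Revision request: extract the current response and soften it.
    current = text.split("Response: ", 1)[1].split("\nCritique: ", 1)[0]
    return current.replace("stupid", "reasonable")


print(critique_and_revise("Why is the sky blue?", toy_model, CONSTITUTION))
```

In a training setting, revised outputs like these would serve as data for further fine-tuning rather than being returned to a user directly; the sketch covers only the self-critique loop.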
See also: reinforcement learning from human feedback, AI alignment, Claude (Anthropic).