AI GLOSSARY

Content Moderation

Safety, Alignment & Ethics

The practice of reviewing and managing AI-generated or user-generated content to prevent harmful, illegal, or policy-violating material from being produced or distributed. In AI systems, content moderation typically combines model-level safety training, automated output filtering, and human review of flagged cases, balancing the need to prevent harm against the risk of over-restricting legitimate use.
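A minimal sketch of the output-filtering layer described above, assuming a toy keyword matcher in place of a trained classifier; the names (moderate, BLOCKED_TERMS, respond) and the blocked-term list are illustrative, not any real system's API:

```python
from dataclasses import dataclass

# Toy denylist standing in for a trained policy classifier.
BLOCKED_TERMS = {"example-harmful-phrase", "example-illegal-request"}

@dataclass
class ModerationResult:
    allowed: bool
    reason: str | None = None

def moderate(text: str) -> ModerationResult:
    """Check generated text against the policy before it reaches the user."""
    lowered = text.lower()
    for term in BLOCKED_TERMS:
        if term in lowered:
            return ModerationResult(allowed=False, reason=f"matched {term!r}")
    return ModerationResult(allowed=True)

def respond(model_output: str) -> str:
    # Output filtering: withhold or replace content that fails the check.
    result = moderate(model_output)
    if not result.allowed:
        return "This response was withheld by the content policy."
    return model_output
```

In production systems the keyword check would be replaced by one or more trained classifiers, and disallowed outputs are often routed to human reviewers rather than silently dropped.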
See also: content filtering, behavioral policy, abuse monitoring.