AI GLOSSARY
Policy Model
Research & Advanced Concepts
In reinforcement learning, the policy model is the model that determines which action an agent takes in a given state; it is the agent's strategy, or decision-making function. In large language model training, the policy model is the language model being trained to produce better responses, guided by a reward model that scores the quality of its outputs.
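To make the definition concrete, here is a minimal sketch of a policy as a function from states to action probabilities, updated from a reward signal. It is a toy under stated assumptions, not how any particular language model is trained: `TabularPolicy`, `toy_reward`, and `train` are hypothetical names, the update rule is a basic REINFORCE-style policy gradient chosen for brevity, and `toy_reward` is a hand-written stand-in for a learned reward model.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TabularPolicy:
    """Hypothetical toy policy: one row of action logits per state."""

    def __init__(self, n_states, n_actions):
        self.logits = [[0.0] * n_actions for _ in range(n_states)]

    def action_probs(self, state):
        # The policy itself: state -> distribution over actions.
        return softmax(self.logits[state])

    def sample_action(self, state, rng):
        probs = self.action_probs(state)
        return rng.choices(range(len(probs)), weights=probs, k=1)[0]

    def reinforce_update(self, state, action, reward, lr=0.1):
        # Gradient of log pi(action | state) w.r.t. the logits of a
        # softmax policy is one_hot(action) - probs; scale it by reward.
        probs = self.action_probs(state)
        for a in range(len(probs)):
            grad = (1.0 if a == action else 0.0) - probs[a]
            self.logits[state][a] += lr * reward * grad


def toy_reward(state, action):
    # Assumption: a stand-in for a reward model. It simply scores the
    # action that matches the state as good and everything else as bad.
    return 1.0 if action == state else 0.0


def train(n_steps=2000, seed=0):
    rng = random.Random(seed)
    policy = TabularPolicy(n_states=2, n_actions=2)
    for _ in range(n_steps):
        state = rng.randrange(2)
        action = policy.sample_action(state, rng)
        policy.reinforce_update(state, action, toy_reward(state, action))
    return policy


if __name__ == "__main__":
    trained = train()
    print(trained.action_probs(0))
    print(trained.action_probs(1))
```

After training, the policy should put most of its probability on the rewarded action in each state. In the language model setting the roles are analogous: the state is the prompt plus the text generated so far, the action is the next token, and the reward comes from a reward model rather than a hand-written rule.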