AI GLOSSARY

Policy Model

Research & Advanced Concepts

In reinforcement learning, the policy model determines which action an agent takes in a given state; it is the agent's strategy or decision-making function. In large language model training, the policy model is the language model being trained to produce better responses, guided by a reward model that evaluates output quality.
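
As a rough illustration of the definition, the sketch below treats a policy as a function from a state to a probability distribution over actions, and nudges it toward actions that earn higher reward. It is a minimal toy, not how any particular system is implemented: the `TabularPolicy` class, its method names, the learning rate, and the use of a plain number as the reward are all assumptions made for this example. In language model training the policy would be a neural network and the reward would typically come from a reward model.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TabularPolicy:
    """Hypothetical toy policy: one row of action scores (logits) per state."""

    def __init__(self, n_states, n_actions):
        # All-zero logits give a uniform distribution over actions.
        self.logits = [[0.0] * n_actions for _ in range(n_states)]

    def action_probs(self, state):
        """Return the probability of each action in the given state."""
        return softmax(self.logits[state])

    def sample_action(self, state, rng=random):
        """Pick an action at random according to the policy's probabilities."""
        probs = self.action_probs(state)
        return rng.choices(range(len(probs)), weights=probs, k=1)[0]

    def update(self, state, action, reward, lr=0.1):
        """REINFORCE-style step: raise the chosen action's probability
        in proportion to the reward (lower it if the reward is negative)."""
        probs = self.action_probs(state)
        for a in range(len(probs)):
            # Gradient of log pi(action | state) with respect to logit a.
            grad = (1.0 if a == action else 0.0) - probs[a]
            self.logits[state][a] += lr * reward * grad


if __name__ == "__main__":
    policy = TabularPolicy(n_states=2, n_actions=3)
    action = policy.sample_action(state=0)
    # The reward here is a stand-in for a reward model's score.
    policy.update(state=0, action=action, reward=1.0)
    print(policy.action_probs(0))
```

After a positive-reward update, the chosen action becomes more likely in that state while the probabilities still sum to 1, which is the sense in which the reward signal "guides" the policy.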