AI GLOSSARY
Quantization
AI & Machine Learning
A technique that reduces the numerical precision of a model's values, for example storing weights as 8-bit integers instead of 32-bit floating-point numbers. Moving from 32-bit to 8-bit storage cuts the memory needed for weights to roughly a quarter and can speed up inference on hardware with fast integer arithmetic, typically with only a modest reduction in accuracy. It is especially useful for deploying models on mobile or edge devices.
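As a rough illustration of the idea, the sketch below quantizes a list of 32-bit float weights to 8-bit integers using a single shared scale factor (symmetric, per-tensor quantization, which is only one of several common schemes). The function names `quantize_int8` and `dequantize` and the random weights are assumptions made for this example and do not come from any particular library.

```python
from array import array
import random


def quantize_int8(weights):
    """Map float weights to int8 values using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    # The scale maps the largest magnitude onto 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp into the symmetric int8 range.
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Recover approximate float32 weights from the int8 values."""
    return array("f", (v * scale for v in q))


# Hypothetical weights: 1000 values drawn from a standard normal distribution.
random.seed(0)
weights = array("f", (random.gauss(0.0, 1.0) for _ in range(1000)))

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

size_fp32 = len(weights) * weights.itemsize  # bytes used by 32-bit floats
size_int8 = len(q) * q.itemsize              # bytes used by 8-bit integers
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("float32 bytes:", size_fp32)
print("int8 bytes:", size_int8)
print("largest round-trip error:", max_error)
```

The round-trip error of each weight is bounded by about half the scale factor, which is the source of the modest accuracy loss. Production toolchains usually refine this with per-channel scales, zero points, or calibration data.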