AI GLOSSARY

Quantization

AI & Machine Learning

A technique that reduces the numerical precision of a model's values, for example storing weights as 8-bit integers instead of 32-bit floating-point numbers. Going from 32 bits to 8 bits cuts weight storage roughly fourfold and, on hardware with efficient integer arithmetic, usually makes the model faster to run, typically with only a modest reduction in accuracy. It is especially useful for deploying models on mobile or edge AI devices.
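
As a rough illustration of the idea, the sketch below maps a handful of floating-point weights onto signed 8-bit integers using a single shared scale factor, then converts them back. It assumes a simple symmetric scheme; the function names `quantize_int8` and `dequantize` and the sample weights are made up for this example, and real toolkits generally use more elaborate schemes (per-channel scales, zero points, calibration data).

```python
def quantize_int8(weights):
    """Map floats to signed 8-bit integers using one shared scale (symmetric scheme)."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero input, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range used here.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the stored integers."""
    return [v * scale for v in q]


weights = [0.82, -1.27, 0.003, 0.5, -0.44]  # hypothetical example values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is at most half of one quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("quantized:", q)
print("scale:", scale)
print("max error:", max_error)
```

Each quantized value fits in one byte instead of four, which is where the size reduction comes from. The price is the rounding error: small weights such as 0.003 collapse to zero at this scale.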