AI GLOSSARY
Sharding
Deployment & Infrastructure
The practice of splitting a large model's parameters or a dataset across multiple devices or machines to overcome per-device memory limits. Each shard holds a portion of the total parameters or data, and the shards cooperate as a single logical whole during training or inference. Sharding is standard practice for training and deploying models too large to fit on a single accelerator.
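The partitioning step can be sketched in plain Python. This is a minimal, device-free illustration (the `shard` and `all_gather` names are illustrative, not from any particular framework): a flat list stands in for model parameters, each "device" receives a contiguous slice, and gathering the slices back reconstructs the full set.

```python
def shard(params, num_devices):
    """Split a flat parameter list into contiguous shards, one per
    device. Sizes differ by at most one when the split is uneven."""
    base, extra = divmod(len(params), num_devices)
    shards, start = [], 0
    for d in range(num_devices):
        size = base + (1 if d < extra else 0)
        shards.append(params[start:start + size])
        start += size
    return shards

def all_gather(shards):
    """Reassemble the full parameter list from every device's shard."""
    return [p for s in shards for p in s]

params = list(range(10))            # stand-in for 10 model parameters
shards = shard(params, 4)           # 4 hypothetical devices
print([len(s) for s in shards])     # → [3, 3, 2, 2]
assert all_gather(shards) == params # shards jointly cover every parameter
```

Real systems shard tensors (not Python lists) and use collective operations such as all-gather to exchange shards, but the bookkeeping above is the same idea: each device stores only its slice and communicates when the full set is needed.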