AI GLOSSARY

Sharding

Deployment & Infrastructure

The practice of splitting a large model or dataset across multiple devices or machines to overcome memory limitations. Each shard holds a portion of the total parameters or data, and the pieces work together as a unified whole during training or inference. Standard practice for training and deploying models too large to fit on a single accelerator.
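The idea can be sketched with plain NumPy: below, a weight matrix is split column-wise into shards, each hypothetical "device" computes a partial output from its shard alone, and the partial results are concatenated. This is an illustrative toy, not a real framework API; production systems (e.g. tensor parallelism in large-model training) apply the same principle across physical accelerators.

```python
import numpy as np

# Toy sketch of tensor sharding: split a weight matrix across
# several simulated "devices" and show the shards still act as
# one layer. All names here are illustrative, not a real API.

num_devices = 4
x = np.random.randn(8, 512)       # input batch
W = np.random.randn(512, 1024)    # full weight matrix (pretend it
                                  # is too large for one device)

# Shard W column-wise: device i holds only W[:, i*256:(i+1)*256].
shards = np.split(W, num_devices, axis=1)

# Each device multiplies the input by its own shard independently...
partial_outputs = [x @ w_shard for w_shard in shards]

# ...and concatenating the partials reconstructs the full output,
# so the sharded layer behaves exactly like the unsharded one.
y_sharded = np.concatenate(partial_outputs, axis=1)

assert np.allclose(y_sharded, x @ W)
```

Splitting along columns means no device ever needs to hold the full matrix, which is the memory saving the definition describes; the communication cost is the concatenation step at the end.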