AI GLOSSARY
Latency
Deployment & Infrastructure
The time between sending a request to an AI system and receiving a response. Low latency is critical for real-time applications like voice assistants or live chat, while batch processing tasks can tolerate higher latency. Latency is a key consideration when choosing between models and deployment strategies.
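The definition above can be made concrete with a small sketch: timing a request with a wall-clock timer and reporting the elapsed time in milliseconds. The `fake_model_call` function is a hypothetical stand-in for a real model request, not an actual API.

```python
import time

def measure_latency_ms(send_request):
    """Return the wall-clock latency of a single request, in milliseconds."""
    start = time.perf_counter()
    send_request()  # block until the response arrives
    return (time.perf_counter() - start) * 1000.0

# Hypothetical stand-in for a model call: sleeps 50 ms to simulate work.
def fake_model_call():
    time.sleep(0.05)

latency_ms = measure_latency_ms(fake_model_call)
print(f"latency: {latency_ms:.1f} ms")
```

In practice you would measure many requests and report percentiles (e.g. p50, p99) rather than a single value, since latency varies run to run.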