AI GLOSSARY

Latency

Deployment & Infrastructure

The elapsed time between sending a request to an AI system and receiving a response, typically measured in milliseconds or seconds. Low latency is critical for real-time applications like voice assistants or live chat, while batch processing tasks can tolerate higher latency. Latency is a key consideration when choosing between models and deployment strategies.
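In practice, latency can be measured by timing a call from request to response. A minimal sketch, using Python's `time.perf_counter`; the `fake_model_call` function here is a hypothetical stand-in for a real model or API request:

```python
import time

def fake_model_call(prompt: str) -> str:
    # Hypothetical stand-in for a real model/API request.
    time.sleep(0.05)  # simulate ~50 ms of processing
    return f"response to: {prompt}"

def measure_latency_ms(fn, *args):
    """Return (result, elapsed time in milliseconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms

result, latency_ms = measure_latency_ms(fake_model_call, "hello")
print(f"latency: {latency_ms:.1f} ms")
```

Because individual measurements vary, production systems usually report latency as an average or a percentile (such as p95) over many requests rather than a single timing.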