AI GLOSSARY

Image Captioning

Computer Vision

The task of automatically generating a natural language description of an image. It requires both recognizing what is in the image and expressing it in coherent text, making it a bridge between computer vision and natural language processing.
See also: computer vision, multimodal model, image recognition.
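
As a rough illustration of the two halves named in the definition (recognizing content, then expressing it as text), here is a toy Python sketch. It is not how production captioning systems work; those typically use a learned model that generates text from image features. The function name `describe` and the hard-coded label list are assumptions made for illustration, and the recognition step is stubbed out rather than run on a real image.

```python
from typing import List


def describe(labels: List[str]) -> str:
    """Turn recognized object labels into one English sentence (toy template)."""
    if not labels:
        return "An image with no recognized objects."
    # Prefix each label with an article chosen by its first letter.
    phrases = [
        ("an " if label[0].lower() in "aeiou" else "a ") + label
        for label in labels
    ]
    if len(phrases) == 1:
        listing = phrases[0]
    else:
        listing = ", ".join(phrases[:-1]) + " and " + phrases[-1]
    return "An image showing " + listing + "."


# Hypothetical output of a recognition stage; no real model is run here.
recognized = ["dog", "frisbee", "orange cone"]
print(describe(recognized))
```

A fixed template like this can only list objects. Describing actions and relationships between objects is the part that makes captioning a language task as well as a vision task.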