AI GLOSSARY
Token
AI & Machine Learning
The basic unit of text that a language model processes, roughly corresponding to a word, part of a word, or a punctuation mark. Models do not read text character by character or word by word; they break it into tokens first. Understanding tokenization helps explain why models sometimes behave unexpectedly with unusual words, numbers, or languages.
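As a rough illustration of how text can be broken into subword tokens, the sketch below uses greedy longest-match over a small made-up vocabulary. Real tokenizers (such as BPE-based ones) learn their vocabularies from large corpora; the `TOY_VOCAB` here is purely hypothetical.

```python
# A minimal sketch of subword tokenization via greedy longest-match.
# TOY_VOCAB is an invented example vocabulary, not a real model's.
TOY_VOCAB = {"un", "usual", "token", "tokens", "ization", " ", "!",
             "word", "s"}

def tokenize(text: str) -> list[str]:
    """Greedily match the longest vocabulary entry at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible substring first.
        for j in range(len(text), i, -1):
            if text[i:j] in TOY_VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Unknown character: fall back to emitting it on its own,
            # loosely analogous to a byte-level fallback.
            tokens.append(text[i])
            i += 1
    return tokens

print(tokenize("unusual tokenization"))
```

Note how "unusual" splits into two pieces and "tokenization" into a common stem plus suffix; this kind of splitting is why rare or misspelled words can fragment into many tokens and trip up a model.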