AI GLOSSARY

Token

AI & Machine Learning

The basic unit of text that a language model processes, roughly corresponding to a word, part of a word, or a punctuation mark. Models do not read text character by character or word by word; they break it into tokens first. Understanding tokenization helps explain why models sometimes behave unexpectedly with unusual words, numbers, or languages.
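As a rough sketch of why unusual words split into multiple tokens, here is a toy greedy longest-match subword tokenizer. The vocabulary below is hand-picked purely for illustration; real models learn their vocabularies with algorithms like BPE or WordPiece, so actual token boundaries will differ.

```python
# Toy subword tokenizer: greedy longest-match against a tiny,
# hand-picked vocabulary (illustration only -- real tokenizers
# use large learned vocabularies, e.g. from BPE training).

TOY_VOCAB = {"token", "ization", "un", "usual", " "}

def tokenize(text, vocab=TOY_VOCAB):
    """Match the longest vocabulary entry at each position,
    falling back to single characters for unknown input."""
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest possible substring first, then shrink.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                match = text[i:j]
                break
        if match is None:
            match = text[i]  # unknown character becomes its own token
        tokens.append(match)
        i += len(match)
    return tokens

print(tokenize("unusual tokenization"))
# -> ['un', 'usual', ' ', 'token', 'ization']
```

Note how a common word like "token" survives as one piece while a rarer word like "unusual" is split, which mirrors why models can stumble on rare words, long numbers, or underrepresented languages: those inputs fragment into many small, less meaningful tokens.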