AI GLOSSARY

Self-Attention

Neural Network Architectures

A mechanism in which each element of a sequence attends to every other element of the same sequence, allowing the model to capture relationships and dependencies regardless of distance. Self-attention is the defining operation of the transformer architecture: it lets a language model relate any word in a sentence to any other word across the full context window simultaneously, rather than processing text through a fixed local window.
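The mechanism above can be sketched in a few lines of NumPy. This is a minimal, illustrative implementation of scaled dot-product self-attention for a single head; the function name, matrix shapes, and random inputs are assumptions for the example, not part of any particular library.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of shape (n, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv           # project tokens to queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # every position scores every other position
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: rows sum to 1
    return weights @ V                         # each output is a weighted mix of all values

# Toy example (hypothetical sizes): 4 tokens, 8-dimensional embeddings.
rng = np.random.default_rng(0)
n, d = 4, 8
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): same sequence length, but every token has seen every other
```

The `scores` matrix is n-by-n, which is why self-attention's cost grows quadratically with sequence length: relating every element to every other element requires a pairwise comparison.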
See also: attention mechanism, transformer.