REFERENCE
AI GLOSSARY
Have you ever come across an AI term you didn't quite understand? This glossary covers everything from core machine learning to AI governance, security, and ethics, explained in plain language.
A
A/B Testing
A method for comparing two versions of a model, interface, prompt, or system behavior by exposing different users or requests to each version and measuring which performs better against a defined metric. In AI product development, A/B testing is one of the clearest ways to validate whether a change actually improves outcomes in the real world rather than just looking promising in offline evaluation.
Ablation
The removal or disabling of a component in a model, dataset, or pipeline to understand how much that component contributes to the overall result. In AI, the term usually appears in the context of an ablation study, where researchers systematically test what happens when a feature, training signal, or architectural choice is taken away.
Ablation Study
An experiment where components of an AI system are systematically removed or disabled, one at a time, to determine how much each contributes to overall performance. The term "ablation", which loosely means to remove or erode, comes from neuroscience and biology, where it describes the removal of biological components. In AI, it helps researchers identify which parts of a model or pipeline are actually doing the useful work.
Abuse Monitoring
The ongoing process of detecting and responding to misuse of an AI system, identifying patterns of harmful, policy-violating, or malicious use across all interactions. Abuse monitoring should combine automated detection with human review, and is used for maintaining the safety and integrity of deployed AI products at scale, where manual review of every interaction is impractical. The same infrastructure can, in principle, also enable broad surveillance of user activity, making independent audits of its ethical and responsible use an essential safeguard in any system deployed this way.
Accelerator Chip
A specialized hardware chip designed to speed up AI workloads, especially the matrix operations and parallel computations used in training and inference. GPUs, TPUs, and other dedicated accelerators are central to modern AI because general-purpose CPUs are usually too slow or too inefficient for large-scale model workloads.
Activation Addition
A form of activation steering in which a concept vector is added to a model's internal activations during inference to amplify or introduce a target behavior. The vector is typically derived by contrasting activations from prompts that do and do not exhibit the concept. For example, subtracting the 'sad' activation from the 'happy' activation yields a direction that, when added, steers the model toward happier outputs.
See also: Activation Steering, Activation Subtraction, Representation Engineering.
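A minimal numpy sketch of the idea, with toy vectors standing in for real hidden states (in practice these would be activations captured from a transformer layer):

```python
import numpy as np

# Toy activations recorded at one layer on contrasting prompts (invented values).
act_happy = np.array([0.9, 0.1, 0.4])
act_sad = np.array([0.1, 0.8, 0.3])

# Contrasting the two yields a direction associated with the concept.
steering_vector = act_happy - act_sad

def steer(hidden_state, vector, strength=1.0):
    # Add the concept vector to a layer's activations during inference.
    return hidden_state + strength * vector

hidden = np.array([0.2, 0.5, 0.1])  # current activation at the chosen layer
print(steer(hidden, steering_vector, strength=2.0))
```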
Activation Function
A mathematical function applied to the output of a neuron in a neural network, determining whether and how strongly that neuron fires. In practice, a neural network passes data through repeated cycles of linear transformation followed by an activation function, and it is this alternation that gives the network its ability to learn complex, non-linear patterns. Without activation functions, stacking layers would be mathematically pointless, as any stack of linear transformations collapses into a single one.
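A small numpy illustration of why the non-linearity matters; the numbers and the choice of ReLU are purely for demonstration:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

x = np.linspace(-2, 2, 5)
w1, w2 = 0.5, -1.5  # two one-dimensional "layers"

# Without an activation, two linear layers collapse into one: w2*(w1*x) == (w2*w1)*x.
print(np.allclose(w2 * (w1 * x), (w2 * w1) * x))  # True

# With ReLU between them, the composition is genuinely non-linear.
print(w2 * relu(w1 * x))
```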
Activation Steering
A technique for influencing a neural network's behavior by directly modifying its internal activations during inference, adding or subtracting vectors that correspond to specific concepts or behaviors. Rather than changing the model's weights or prompting it differently, activation steering intervenes at the mechanistic level, making it a valuable tool for interpretability and alignment research. Adding a vector is called activation addition; subtracting one is called activation subtraction.
Activation Subtraction
A form of activation steering in which a concept vector is subtracted from a model's internal activations during inference to suppress or remove a target behavior. Subtracting the vector associated with a concept (such as deception or refusal) reduces the model's tendency to exhibit it, without retraining or fine-tuning.
See also: Activation Steering, Activation Addition, Representation Engineering.
Active Learning
A training approach where the model identifies which data points it is most uncertain about and requests labels for those specific examples, rather than being trained on a randomly selected dataset. This makes active learning particularly useful when labeled data is expensive or time-consuming to produce, as human annotation effort is focused where it matters most, achieving strong performance with considerably less labeled data.
Actor-Critic
A reinforcement learning architecture combining two components: an actor that decides which actions to take, and a critic that evaluates how good those actions were. The critic's feedback shapes the actor's learning over time, making the combination more stable and sample-efficient than either approach on its own. See also: Reinforcement Learning.
Adversarial Attack
A deliberate attempt to manipulate an AI system by crafting inputs designed to cause errors, bypass safety measures, or produce unintended outputs. Adversarial attacks exploit weaknesses in how models generalize, where small, carefully designed perturbations, often imperceptible to humans, can cause confident misclassification or harmful outputs.
Adversarial Example
A carefully crafted input, often indistinguishable from a normal input to a human, that causes an AI model to make a specific, incorrect prediction. A classic example is an image of a panda with imperceptible noise added, causing a classifier to confidently identify it as a gibbon. Where an adversarial attack describes the act of manipulation, an adversarial example is the crafted input itself. Together, they reveal fundamental vulnerabilities in how neural networks represent and process information.
Adversarial Machine Learning
A broad field of machine learning concerned with how AI systems behave when faced with inputs designed to deceive or manipulate them, and how to make those systems more robust. It encompasses both the offensive side, crafting adversarial attacks and adversarial examples, and the defensive side, covering adversarial training, input validation, and robustness testing. The term also underpins generative techniques like generative adversarial networks (GANs), where two models are pitted against each other to improve output quality.
Agent-to-Agent Protocol (A2A)
A protocol designed to let independent AI agents communicate, coordinate, and exchange structured information with one another. In practice, A2A matters when agentic systems stop being single isolated loops and start operating more like distributed teams, where they need a shared way to hand off tasks, pass context, and negotiate work.
Agentic AI
AI that operates with a degree of autonomy, taking sequences of actions and making decisions over time to complete longer, more complex tasks. Unlike a simple chatbot, agentic AI can plan ahead, use tools, and adapt based on what it encounters.
See also: AI Agent, Agentic Workflow.
Agentic Workflow
A structured process in which an AI agent carries out a multi-step task, searching the web, writing code, checking results, and revising, all as part of a single automated sequence. The workflow defines what the agent does and in what order.
See also: AI Agent, Agentic AI.
Agglomerative Clustering
A hierarchical clustering method that starts by treating each data point as its own cluster and then repeatedly merges the closest clusters until a stopping condition is reached. It is useful when the goal is not just to assign points to groups, but to understand the nested structure of similarity within a dataset.
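A short scikit-learn sketch on toy 2-D points; the linkage choice and cluster count are illustrative assumptions:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Six points forming two loose groups.
X = np.array([[0, 0], [0.2, 0.1], [0.1, 0.3],
              [5, 5], [5.2, 4.9], [4.8, 5.1]])

# Each point starts as its own cluster; the closest clusters merge until two remain.
model = AgglomerativeClustering(n_clusters=2, linkage="average")
print(model.fit_predict(X))  # e.g. [0 0 0 1 1 1]
```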
AI Agent
A software system that perceives its environment, makes decisions, and takes actions to achieve a goal, often without needing a human to guide every step. Think of it as a program that can do things on your behalf, not just answer questions.
See also: Agentic AI, Autonomous Agent.
AI Alignment
The challenge of ensuring that an AI system's goals, values, and behaviors are consistent with human intentions and values, that it does what we actually want, not just what we literally specified. Alignment is considered one of the most important and difficult problems in AI safety, because as systems become more capable, the consequences of misalignment become more severe and harder to correct.
See also: AI Safety, Control Problem, Corrigibility.
AI Auditing
The systematic examination of an AI system by an independent party to assess its performance, fairness, safety, and compliance with applicable standards or regulations. AI auditing is becoming increasingly important as regulators and organizations seek assurance that AI systems behave as intended and do not cause harm. Frameworks such as the EU AI Act and the NIST AI Risk Management Framework have made auditing a formal expectation for high-risk AI systems.
See also: Auditability, AI Governance, Conformity Assessment.
AI Containment
A set of strategies aimed at limiting an AI system's ability to take actions beyond its intended scope, restricting its access to resources, communication channels, and the broader environment. Containment is a safeguard against unexpected or dangerous behavior in powerful AI systems, buying time to detect and correct problems before they propagate. It is generally considered a complement to alignment rather than a substitute for it, since containment alone becomes harder to maintain as systems grow more capable.
See also: AI Alignment, AI Safety, Control Problem.
AI Ethics
The branch of applied ethics concerned with the moral questions raised by AI, including fairness, accountability, transparency, privacy, and the potential for harm. AI ethics seeks to ensure that AI systems are developed and deployed in ways that reflect human values and respect individual rights, and informs both regulatory frameworks and organizational practices.
See also: AI Governance, AI Policy.
AI Governance
The policies, processes, roles, and structures an organization or government puts in place to oversee the development, deployment, and use of AI responsibly. Good AI governance ensures that AI decisions are accountable, risks are managed, and the technology is used in alignment with legal requirements and ethical principles. International frameworks such as the OECD AI Principles serve as a common foundation across jurisdictions.
See also: AI Policy, AI Regulation, AI Ethics.
AI Lifecycle
The full sequence of stages through which an AI system moves, from problem framing and data collection through training, evaluation, deployment, monitoring, and eventual retirement or replacement. Thinking in terms of an AI lifecycle helps make clear that building a model is only one part of operating an AI system responsibly and effectively.
AI Policy
The set of rules, guidelines, and principles, at organizational, national, or international level, that govern how AI is developed, deployed, and used. AI policy covers areas including safety standards, data rights, liability, procurement, and the responsibilities of developers and deployers toward users and society.
See also: AI Governance, AI Regulation.
AI Product
A software product or service that uses AI as a core part of its functionality, such as a smart writing assistant, a fraud detection system, or a customer service chatbot. The AI is what makes the product work, not just a feature bolted on.
See also: AI Strategy, Application Layer.
AI Regulation
Legally binding rules imposed by governments or regulatory bodies on the development and use of AI systems. AI regulation is rapidly evolving globally, with the EU AI Act being the most comprehensive example to date, covering risk classification, prohibited practices, transparency requirements, and enforcement mechanisms.
See also: EU AI Act, AI Governance, AI Policy.
AI Safety
The broad research and engineering field concerned with ensuring that AI systems behave reliably, predictably, and in accordance with human values, both now and as systems become more capable. AI safety encompasses technical work on alignment and robustness, as well as governance and policy work on how to manage the risks that advanced AI systems might pose.
See also: AI Alignment, Control Problem, AI Governance.
AI Sprawl
The uncontrolled spread of AI tools, models, workflows, and experiments across an organization without consistent governance, visibility, or integration. AI sprawl often emerges when teams adopt tools quickly and independently, which can accelerate experimentation but also create duplication, security gaps, and operational confusion.
AI Strategy
An organization's plan for how it will use AI to achieve its goals, covering which problems to solve, what tools to adopt, how to manage risk, and how to build the necessary skills and infrastructure. A good AI strategy aligns technology decisions with business objectives.
See also: AI Governance, AI Product.
AI Winter
A period of reduced funding, interest, and progress in AI research, typically following a wave of overhyped expectations that the technology failed to meet. There have been two major AI winters, roughly 1974–1980 and 1987–1993, and the term serves as a reminder that hype cycles and genuine progress do not always move together.
Algorithm
A step-by-step set of instructions that a computer follows to solve a problem or complete a task. In AI and machine learning, algorithms define how a model learns from data, with common examples including decision trees and k-means clustering.
Algorithmic Accountability
The principle that organizations using AI systems to make decisions that affect people should be able to explain, justify, and take responsibility for those decisions. Algorithmic accountability requires documentation, auditability, and, in some jurisdictions, legal obligations to provide explanations when automated decisions have significant impacts.
See also: Auditability, Contestability.
Algorithmic Bias
Systematic and unfair discrimination in AI system outputs that arises from biases in training data, model design, or deployment context. Algorithmic bias can perpetuate or amplify existing social inequalities, for example when a hiring algorithm disadvantages certain demographic groups because historical hiring data reflected human prejudice. Identifying and mitigating it is a central concern in responsible AI development.
See also: Bias (Data), Bias Mitigation, Demographic Parity.
Anthropomorphism
The tendency to attribute human characteristics, emotions, or intentions to AI systems. When people describe a chatbot as feeling curious or wanting to help, they are anthropomorphizing. While it can make AI more relatable, it also leads to misplaced trust, unrealistic expectations, and a fundamental misunderstanding of how these systems actually work.
Application Layer
The part of a software stack where AI capabilities are turned into user-facing features, workflows, and business logic. In an AI system, the application layer is where models stop being abstract technical components and start becoming products, agents, copilots, search experiences, or automated processes that people actually use.
Area Under the Curve (AUC)
A single number summarizing the overall performance of a classification model across all possible decision thresholds, derived from the ROC curve. A perfect model scores 1.0, while a model no better than random chance scores 0.5. AUC is particularly useful when comparing models or when the cost of false positives and false negatives differs.
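For illustration, a small scikit-learn example with made-up labels and confidence scores:

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1, 1, 0]                 # ground-truth labels
y_score = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2]   # model confidence for the positive class

# AUC summarizes ranking quality across all thresholds: 1.0 is perfect, 0.5 is random.
print(roc_auc_score(y_true, y_score))
```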
Area Under the PR Curve (AUPRC)
A metric that summarizes model performance across different precision and recall tradeoffs by measuring the area under the precision-recall curve. It is especially useful for imbalanced classification problems where overall accuracy can hide poor performance on the minority class.
See also: Precision, Recall, Area Under the Curve (AUC).
Artificial General Intelligence (AGI)
A hypothetical form of AI that can perform any intellectual task a human can, including reasoning, learning, planning, and adapting across a wide range of domains without being specifically trained for each one. No system has achieved AGI yet. Today's AI excels at specific tasks rather than general ones, and whether AGI is achievable at all remains actively debated.
See also: Artificial Intelligence, Artificial Superintelligence.
Artificial Intelligence (AI)
The broad field of computer science concerned with building systems that can perform tasks that would normally require human intelligence, such as understanding language, recognizing images, making decisions, and solving problems. The term covers everything from simple rule-based systems to complex neural networks.
Artificial Neural Network (ANN)
A computational model loosely inspired by the structure of the human brain, consisting of layers of interconnected nodes that process and transform data. ANNs are the foundation of modern deep learning and power most of today's advanced AI systems.
See also: Deep Learning, Machine Learning, Activation Function.
Artificial Superintelligence (ASI)
A theoretical form of AI that surpasses human intelligence across all domains, not just matching human capability but vastly exceeding it. ASI remains speculative and is a central topic in long-term AI safety research.
See also: Artificial General Intelligence, AI Safety.
Assistant
An AI system designed to help users complete tasks through conversation or instruction, answering questions, drafting content, summarizing documents, and more. Assistants are typically general-purpose and interact in natural language.
See also: Chatbot, Copilot.
Attack Surface
The total set of points where an AI system is exposed to potential malicious input or exploitation, including APIs, user interfaces, training pipelines, and third-party dependencies. The larger and more complex a system, the more opportunities exist for adversaries to find and exploit weaknesses, making attack surface reduction a foundational step in AI security.
See also: Adversarial Attack, Data Poisoning.
Attention
A mechanism that allows a model to decide which parts of its input should matter most when producing an output. Attention changed modern AI because it gave models a more flexible way to represent relationships within sequences, making it possible to handle language, vision, and multimodal data with far greater effectiveness than many earlier architectures.
Attention Mechanism
A component of neural network architectures that allows a model to selectively focus on the most relevant parts of its input when producing each part of its output, rather than treating all input equally. Attention was the key innovation behind the Transformer architecture, introduced in the 2017 paper "Attention Is All You Need", and is now central to virtually all state-of-the-art models in language, vision, and beyond.
See also: Transformer, Self-Attention.
Attribute Sampling
A sampling method where data is selected or evaluated based on the presence, absence, or distribution of particular attributes. In AI and data work, attribute sampling can be used to inspect dataset quality, check class balance, or ensure that an evaluation set reflects the kinds of characteristics a model is meant to handle.
Auditability
The degree to which an AI system's decisions, processes, and data can be examined and verified by internal or external reviewers. Auditable systems maintain records of their inputs, outputs, and decision logic in a form that can be inspected, enabling accountability, regulatory compliance, and the detection of errors or biases that might not be apparent from outputs alone.
See also: AI Auditing, Algorithmic Accountability, Contestability.
Augmented Intelligence
A framing of AI as a tool that enhances and extends human capabilities rather than replacing them. Where artificial intelligence implies machines acting autonomously, augmented intelligence emphasizes collaboration between human judgment and machine processing, with the human remaining in control and AI amplifying what they can achieve.
Autoencoder
A neural network architecture trained to compress input data into a compact representation and then reconstruct the original from that compressed form. The bottleneck in the middle forces the network to learn the most essential features of the data. Autoencoders are used for dimensionality reduction, anomaly detection, and as building blocks for more complex generative models.
See also: Neural Network, Dimensionality Reduction.
Automated Decision-Making
The use of AI or algorithmic systems to make decisions about individuals without meaningful human involvement, such as credit scoring, job application screening, or benefit eligibility assessments. Automated decision-making is subject to increasing regulatory scrutiny. Under the EU's GDPR Article 22, individuals have the right to human review and explanation when automated decisions produce legal or similarly significant effects.
See also: Algorithmic Accountability, Contestability.
Automatic Speech Recognition (ASR)
The technology that converts spoken audio into written text, enabling computers to hear and transcribe human speech. ASR powers voice assistants, transcription services, and voice-controlled interfaces, and has improved dramatically with deep learning. Modern ASR systems can handle diverse accents, background noise, and multiple languages with near-human accuracy in many conditions.
Automation Bias
The tendency of humans to over-rely on automated systems, accepting their outputs without sufficient critical scrutiny, even when those outputs are wrong. Automation bias is a significant concern in high-stakes AI applications like medical diagnosis or legal decision-making, where humans working alongside AI may defer to incorrect model outputs rather than applying their own judgment.
See also: Human-in-the-Loop.
Autonomous Agent
An agent that operates independently, making its own decisions without requiring human input at each step. The degree of autonomy varies, with some agents checking in with humans at key moments and others running entirely on their own.
See also: AI Agent, Agentic AI, Autonomy Level.
Autonomy Level
A scale describing how much independent decision-making a robotic or AI system exercises relative to a human operator. In autonomous vehicles, autonomy levels range from level 0, full human control, to level 5, full self-driving with no human involvement required. Autonomy level shapes the safety requirements, regulatory obligations, and human oversight mechanisms appropriate for a given system.
See also: Autonomous Agent, AI Agent.
Autoregressive Model
A model that generates output one element at a time, with each new token, pixel, or data point predicted based on everything that came before it. Most large language models are autoregressive, generating text left to right with each word conditioned on the full preceding context. This sequential nature makes generation straightforward but limits parallelism during inference.
See also: Large Language Model, Token, Decoding.
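A toy Python sketch of the autoregressive loop, using bigram counts over a tiny invented corpus in place of a real language model:

```python
import random

corpus = "the cat sat on the mat "
successors = {}
for a, b in zip(corpus, corpus[1:]):
    successors.setdefault(a, []).append(b)

random.seed(0)
out = "t"
for _ in range(20):
    # Each new character is sampled conditioned on what came before
    # (here just the previous character, for simplicity).
    out += random.choice(successors.get(out[-1], [" "]))
print(out)
```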
AWS SageMaker
Amazon's fully managed cloud platform for building, training, and deploying machine learning models at scale. SageMaker provides an integrated environment covering the full ML lifecycle, from data preparation and model training through deployment and monitoring, and is widely used by enterprises already operating within the AWS ecosystem.
Azure AI
Microsoft's suite of cloud-based AI services and infrastructure, including tools for building, training, and deploying models as well as pre-built services for vision, language, and speech. Azure AI is deeply integrated with Microsoft's broader enterprise ecosystem and provides the infrastructure powering Microsoft's Copilot products and OpenAI model access through the Azure OpenAI Service.
B
Backdoor Attack
A type of training-time attack where an adversary embeds a hidden trigger in a model during training, causing the model to behave normally on most inputs but produce specific, attacker-controlled outputs whenever the trigger pattern is present. Backdoor attacks are particularly dangerous because the compromised model may pass standard evaluations, with the malicious behavior only activating under specific conditions.
See also: Data Poisoning, Adversarial Attack.
Backpropagation
The algorithm used to train neural networks by calculating how much each parameter contributed to an error, then adjusting weights accordingly. It works by propagating the error signal backwards through the network layer by layer using the chain rule of calculus, and is the engine behind nearly all modern deep learning.
See also: Neural Network, Gradient Descent, Deep Learning.
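A hand-worked sketch for a single neuron, with all numbers invented; real frameworks automate exactly this chain-rule bookkeeping across millions of parameters:

```python
# One neuron: y = w*x + b, loss = (y - t)^2.
x, t = 2.0, 1.0      # input and target
w, b = 0.0, 0.0      # parameters to learn
lr = 0.1             # learning rate

for step in range(3):
    y = w * x + b                     # forward pass
    loss = (y - t) ** 2
    dloss_dy = 2 * (y - t)            # error signal, propagated backwards
    dw, db = dloss_dy * x, dloss_dy   # chain rule: dy/dw = x, dy/db = 1
    w, b = w - lr * dw, b - lr * db   # adjust weights against the gradient
    print(f"step={step} loss={loss:.4f} w={w:.3f} b={b:.3f}")
```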
Bandit Problem
A classic decision-making problem in reinforcement learning where an agent must choose between multiple options, each with unknown reward probabilities, and learn from feedback over time. The name comes from the analogy of a gambler choosing between slot machines. Bandit problems formalize the exploration-exploitation tradeoff and have practical applications in recommendation systems, clinical trials, and online advertising.
See also: Reinforcement Learning, Exploration-Exploitation Tradeoff.
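A minimal epsilon-greedy sketch with invented payout probabilities; this is one simple bandit strategy among many:

```python
import random

random.seed(0)
true_payout = [0.3, 0.5, 0.7]   # unknown to the agent
counts = [0, 0, 0]
estimates = [0.0, 0.0, 0.0]
epsilon = 0.1                   # fraction of pulls spent exploring

for _ in range(1000):
    if random.random() < epsilon:
        arm = random.randrange(3)              # explore a random option
    else:
        arm = estimates.index(max(estimates))  # exploit the best-looking option
    reward = 1.0 if random.random() < true_payout[arm] else 0.0
    counts[arm] += 1
    estimates[arm] += (reward - estimates[arm]) / counts[arm]  # running mean

print([round(e, 2) for e in estimates])  # rough payout estimates; the best arm gets pulled most
```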
Baseline
A simple reference model or approach used as a starting point for comparison. When evaluating a new AI system, you measure it against the baseline to understand whether the added complexity is actually delivering better results.
See also: Benchmark, Evaluation.
Batch Inference
Running a model on a large collection of inputs all at once, rather than processing each one as it arrives. Batch inference is more computationally efficient than real-time inference and is used for tasks that do not require an immediate response, such as processing a night's worth of transactions or pre-generating recommendations.
See also: Inference, Real-Time Inference.
Batch Learning
A training approach where the model is trained on the entire dataset at once, in a single training run, rather than updating continuously as new data arrives. Batch learning produces stable, reproducible models but requires retraining from scratch, or from a checkpoint, whenever the model needs to incorporate new data.
See also: Online Learning, Continual Learning.
Batch Normalization
A technique that stabilizes and speeds up training by normalizing the inputs to each layer so they have a consistent scale and distribution. Introduced by Ioffe and Szegedy in 2015, it allows higher learning rates, reduces sensitivity to initial weight settings, and has a regularizing effect that can reduce overfitting. Why exactly it works remains debated, with the original explanation of reducing internal covariate shift now considered incomplete.
See also: Neural Network, Overfitting.
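A numpy sketch of the core normalization step; gamma and beta are the learnable scale and shift, and the values here are toy:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize each feature across the batch to zero mean and unit variance,
    # then apply a learnable scale (gamma) and shift (beta).
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

batch = np.array([[1.0, 200.0], [2.0, 180.0], [3.0, 220.0]])
print(batch_norm(batch))  # both features now sit on a comparable scale
```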
Bayesian Network
A probabilistic model that represents variables and their conditional dependencies using a directed acyclic graph structure. It allows a system to reason under uncertainty, updating beliefs as new evidence arrives, and is used in areas like medical diagnosis, risk assessment, and causal modeling.
See also: Probabilistic Model.
Bayesian Optimization
A strategy for efficiently finding the optimal settings for an expensive function, such as a model's hyperparameters, by building a probabilistic model of that function and using it to decide where to evaluate next. It is much more efficient than exhaustive search and is widely used in automated machine learning and neural architecture search.
See also: Hyperparameter, Hyperparameter Tuning.
Beam Search
A decoding algorithm that explores multiple possible output sequences simultaneously, keeping the most promising candidates at each step rather than committing to a single path. It strikes a balance between the speed of greedy decoding and the thoroughness of exhaustive search, producing higher quality outputs than always picking the single most likely next token.
See also: Decoding, Greedy Decoding.
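A toy Python sketch over an invented next-token probability table, keeping the two most promising partial sequences at each step:

```python
import math

probs = {  # assumed toy distribution, not a real model
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.9, "dog": 0.1},
    "cat": {"</s>": 1.0}, "dog": {"</s>": 1.0},
}

def beam_search(beam_width=2, steps=3):
    beams = [(["<s>"], 0.0)]  # (sequence, log-probability)
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            for tok, p in probs.get(seq[-1], {}).items():
                candidates.append((seq + [tok], score + math.log(p)))
        # Keep only the best partial sequences instead of committing to one path.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams

for seq, score in beam_search():
    print(" ".join(seq), round(score, 3))
```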
Behavioral Policy
The set of rules, norms, and constraints that govern how an AI system behaves, what it will and will not do, how it responds to different types of requests, and what values it reflects in its outputs. Behavioral policies are implemented through a combination of training, fine-tuning, and runtime constraints, and represent the operationalization of an organization's ethical principles and safety requirements.
See also: AI Alignment, Content Moderation.
Benchmark
A standardized test or dataset used to measure and compare the performance of AI models. Benchmarks give researchers and practitioners a common language for evaluating progress, though a model that scores well on benchmarks does not always perform equally well in real-world use.
See also: Evaluation, Baseline.
Bias (Data)
Systematic distortion in a dataset that causes a model trained on it to produce skewed or unrepresentative outputs. Data bias can enter at many points, through unrepresentative sampling, historically prejudiced labels, measurement errors, or simply the way data was collected and curated. Because a model can only learn from what it is shown, biased input reliably produces biased output, regardless of how well the model itself is built.
See also: Algorithmic Bias, Bias Mitigation.
Bias (Model)
A systematic error in a model's predictions that causes it to consistently lean in a particular direction, for example always underestimating or always overestimating a value. Bias often stems from flawed assumptions in the model or unrepresentative training data.
See also: Bias-Variance Tradeoff, Algorithmic Bias.
Bias Mitigation
The set of technical and organizational strategies used to identify and reduce unfair bias in AI systems, addressing disparities in performance, outcomes, or treatment across demographic groups. Bias mitigation can be applied at multiple stages, in data collection, model training, post-processing of outputs, and ongoing monitoring, and requires careful definition of what fairness means in the specific application context.
See also: Algorithmic Bias, Demographic Parity, Fairness.
Bias Term
A learnable parameter in a neural network node that shifts the input to the activation function, allowing the model to fit data that does not pass through the origin. It gives the model more flexibility to represent a wider variety of patterns.
See also: Weight, Activation Function.
Bias-Variance Tradeoff
The fundamental tension in machine learning between a model that is too simple, producing high bias and underfitting the data, and one that is too complex, producing high variance and overfitting it. Finding the right balance is central to building models that generalize well to new data.
See also: Overfitting, Underfitting, Generalization.
Binary Classification
A classification task where each example must be assigned to one of two possible classes, such as spam or not spam, fraud or not fraud, or positive or negative. Binary classification is one of the most common machine learning setups and the basis for many evaluation metrics like precision, recall, and the ROC curve.
Biometric Data
Personal data derived from physical or behavioral characteristics that can uniquely identify an individual, such as fingerprints, facial geometry, voice patterns, or gait. Biometric data is treated as a special category of sensitive personal data under most privacy laws, including the GDPR, subject to stricter rules around collection, storage, and use.
See also: Privacy, GDPR.
Black Box
An AI model whose internal workings are opaque: it produces outputs, but it is difficult or impossible to understand exactly why or how it arrived at a given result. Deep neural networks are often described as black boxes, which raises significant concerns in high-stakes applications where decisions need to be explained and justified.
See also: Explainability, Interpretability, Transparency.
BLEU
A metric for evaluating the quality of machine-generated text, particularly translations, by comparing it to one or more human-written reference texts. It measures how many short sequences of words in the generated text also appear in the reference. BLEU is widely used but has real limitations: it does not capture meaning or fluency well, and high scores do not always correspond to high human-judged quality.
See also: Evaluation.
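For illustration, a small example using NLTK's BLEU implementation (assumes nltk is installed; smoothing is applied because the texts are very short):

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["the", "cat", "is", "on", "the", "mat"]]  # human reference translation(s)
candidate = ["the", "cat", "sat", "on", "the", "mat"]   # machine output

# BLEU counts overlapping n-grams between candidate and reference.
score = sentence_bleu(reference, candidate,
                      smoothing_function=SmoothingFunction().method1)
print(round(score, 3))
```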
Blue Team
The defensive side of a security exercise, the team responsible for protecting an AI system, monitoring for threats, and responding to attacks. Blue teams work to harden systems against the vulnerabilities that red teams expose, with ongoing responsibilities covering monitoring, incident response, and implementation of security controls.
See also: Red Team, Attack Surface.
Bounding Box
A rectangle drawn around an object in an image to indicate its location and size. Bounding boxes are a fundamental output of object detection systems, where the model tells you both what it found and roughly where in the image it is.
See also: Object Detection, Computer Vision.
C
Calibration
A measure of how well a model's confidence scores reflect the true likelihood of its predictions being correct. A well-calibrated model that says it is 80% confident should be right about 80% of the time. Poor calibration, being systematically overconfident or underconfident, is a significant problem in high-stakes applications like medical diagnosis or risk assessment.
See also: Evaluation, Uncertainty Quantification.
Canary Deployment
A release strategy where a new version of an AI model is rolled out to a small subset of users first, before being gradually expanded to everyone. Like the canary in a coal mine, this approach gives early warning of problems before they affect the full user base.
See also: Model Deployment, A/B Testing.
Capability Overhang
A situation where an AI system has latent capabilities that have not yet been discovered or activated, either because they have not been tested for, or because a small change in training or prompting could unlock significantly more powerful behavior. Capability overhangs are a safety concern because they mean a system's true capabilities may be substantially greater than its observed behavior suggests.
See also: AI Safety, Emergent Capabilities.
Catastrophic Forgetting
The tendency of a neural network to abruptly lose previously learned knowledge when trained on new data or tasks, as new learning overwrites old weights. It is a fundamental challenge for continual and lifelong learning, and solving it is an active research area with implications for building AI systems that can accumulate knowledge over time without degrading earlier capabilities.
See also: Continual Learning, Neural Network.
CE Marking
A conformity marking required under EU law indicating that a product, including certain AI systems, meets applicable safety, health, and environmental requirements. Under the EU AI Act, high-risk AI systems must carry CE marking to demonstrate they have undergone the required conformity assessment before being placed on the EU market.
See also: EU AI Act, Conformity Assessment.
Chain of Thought (CoT)
A prompting technique where a model is encouraged to work through a problem step by step before giving a final answer, rather than jumping straight to a conclusion. Chain-of-thought prompting significantly improves performance on complex reasoning tasks, as showing the model how to think out loud leads to more accurate and reliable outputs.
See also: Chain-of-Thought Prompting, Prompt Engineering.
Chain-of-Thought Monitoring
An AI safety oversight technique that inspects a reasoning model's externalized thinking trace (its chain of thought) for signs of deceptive intent, goal misalignment, or harmful reasoning before the model acts on it. Unlike input/output monitoring, CoT monitoring can catch explicit statements of harmful intent and surface flawed reasoning that would otherwise be invisible.
The technique exploits a structural property of current reasoning models: sufficiently complex cognition must pass through the chain of thought as working memory, making it readable to an external monitor. This is called CoT monitorability.
However, monitorability is considered fragile. It can be undermined by reinforcement learning at scale (reasoning drifts toward illegible representations), deliberate obfuscation by situationally-aware models, or architectures that keep reasoning in latent (non-verbalized) form entirely. CoT monitoring is therefore one imperfect layer among several needed safety mechanisms, not a sufficient safeguard on its own.
Chain-of-Thought Prompting
A prompting technique that encourages a language model to reason through intermediate steps before producing a final answer. In practice, chain-of-thought prompting can improve performance on tasks that require structured reasoning, though it is not a guarantee of correctness and can still produce confident but flawed reasoning traces.
See also: Chain of Thought, Prompt Engineering.
Chatbot
A software application that simulates conversation with users, typically through text. Early chatbots followed rigid scripts. Modern AI-powered chatbots use language models to understand and respond to a much wider range of inputs more naturally.
See also: Conversational AI, Assistant, Large Language Model.
Checkpoint
A saved snapshot of a model's weights and state at a particular point during training. Checkpoints allow training to be resumed if interrupted, and earlier checkpoints can be used if a later stage of training leads to worse performance.
See also: Model Weights, Fine-Tuning.
An open-source vector database designed to be simple and developer-friendly, commonly used in local development and smaller-scale applications. Chroma is popular for prototyping AI applications that need embedding storage and similarity search, offering a lightweight alternative to managed services for teams building and testing locally.
See also: Vector Database, Embedding.
In interpretability research, a circuit is a specific subgraph of a neural network, a collection of neurons and their connections, that implements a particular computation or behavior. Identifying circuits helps researchers understand how models process information internally, moving beyond treating neural networks as black boxes toward a more precise, mechanistic understanding of what a model is actually doing.
See also: Interpretability, Mechanistic Interpretability.
Clanker
A slang term for a robot, droid, or AI-driven entity, usually used in a mocking or dismissive way. In broader AI and internet culture, it can be used to describe bots, agents, or machine-like systems that seem stiff, mechanical, or lacking in judgment. The term is informal and often derogatory, so its meaning depends heavily on context.
Classification
A type of machine learning task where the model assigns an input to one of a set of predefined categories, for example classifying an email as spam or not spam, or identifying which digit appears in an image.
See also: Binary Classification, Supervised Learning, Regression.
Claude
A family of large language models developed by Anthropic, designed with a strong emphasis on safety, honesty, and helpfulness. Claude models are built using reinforcement learning from human feedback and constitutional AI techniques.
Closed Model
An AI model whose weights and architecture are not publicly released. Users can only interact with it through an API or product interface, and the underlying technology remains proprietary. Contrasts with open models, where the weights are publicly available. ChatGPT, Claude, Gemini, and Grok are all closed models.
See also: Open Model, Open Weights.
Cloud AI
The delivery of AI capabilities, including model training, inference, and data processing, through cloud computing platforms rather than on local hardware. Cloud AI allows organizations to access powerful computational resources on demand without owning or maintaining physical infrastructure. Cloud AI can be particularly valuable for independent researchers, small research teams, and organizations without easy access to local computing power.
See also: Model Deployment.
Clustering
An unsupervised learning technique that groups data points together based on similarity, without using predefined labels. It is used to discover natural structure in data, for example grouping customers by purchasing behavior or grouping documents by topic.
See also: Unsupervised Learning, Agglomerative Clustering.
Cognitive Architecture
A theoretical framework or computational model that describes the underlying structure of an intelligent system, how it perceives, reasons, remembers, and acts. Cognitive architectures are used both in AI research to build more capable agents and in cognitive science to model human thinking.
See also: AI Agent, Artificial General Intelligence.
Completion
The text a language model generates in response to a given prompt. The term comes from the framing of early language models as text completion systems, where given the beginning of a piece of text, the model completes it. In modern applications, completions can be answers, summaries, code, creative writing, or any other generated output.
See also: Prompt, Large Language Model, Token.
Computational Intelligence
A branch of AI that draws on biologically and linguistically motivated approaches, including neural networks, fuzzy logic, and evolutionary algorithms, to solve complex problems that are difficult to tackle with traditional rule-based methods. It emphasizes learning, adaptation, and robustness over rigid, explicit programming.
See also: Artificial Intelligence, Neural Network.
Concept Drift
A phenomenon where the statistical relationship between input data and the target variable changes over time, causing a deployed model's performance to degrade. For example, a fraud detection model trained on pre-pandemic spending patterns may become less accurate as consumer behavior shifts. Detecting and responding to concept drift is a core part of maintaining AI systems in production.
See also: Model Monitoring, Data Drift.
Conditioning
The process of guiding a generative model's output by providing additional information or constraints alongside the main input, such as a style, tone, format, or reference image. Conditioning gives users and developers control over what kind of output the model produces, rather than leaving it entirely to the model's discretion.
See also: Generative AI, Controlled Generation.
Conformity Assessment
The process by which an AI system is evaluated to determine whether it meets the requirements set out in applicable regulations or standards. Under the EU AI Act, high-risk AI systems must undergo conformity assessment, either through self-assessment or third-party review, before they can be deployed in the EU market.
See also: EU AI Act, CE Marking, AI Auditing.
Confusion Matrix
A table that summarizes a classification model's predictions by showing how many examples of each class were correctly classified and how many were confused with other classes. It provides a detailed breakdown of where a model succeeds and where it makes mistakes, and is the foundation for computing metrics like precision, recall, and F1 score.
See also: Precision, Recall, Classification.
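A quick scikit-learn example with made-up predictions:

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]].
print(confusion_matrix(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall:", recall_score(y_true, y_pred))
```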
Constitutional AI
An approach to AI alignment developed by Anthropic in which a model is trained to evaluate and revise its own outputs according to a set of stated principles, a constitution, rather than relying solely on human feedback for every judgment. Constitutional AI is meant to scale oversight by giving the model explicit values to reason against, reducing dependence on large volumes of human-labeled preference data.
See also: AI Alignment, RLHF, Claude.
Content Authenticity
The ability to verify that a piece of content, whether image, video, audio, or text, is genuine and has not been manipulated or synthetically generated without disclosure. Content authenticity is increasingly important as generative AI makes it easier to create convincing synthetic media. Technical solutions include digital watermarking, cryptographic signing, and provenance standards like the C2PA specification.
See also: Deepfake, Digital Doppelganger, Watermarking.
Content Filtering
The automated screening of inputs to or outputs from an AI system to detect and block harmful, policy-violating, or dangerous content. Content filters can operate at multiple points in the pipeline, checking user inputs before they reach the model, screening model outputs before they are returned to users, or both, and range from simple keyword matching to sophisticated classifier models.
See also: Content Moderation, Abuse Monitoring.
Content Moderation
The practice of reviewing and managing AI-generated or user-generated content to prevent harmful, illegal, or policy-violating material from being produced or distributed. In AI systems, content moderation involves a combination of model-level training, output filtering, and human review, balancing the need to prevent harm against the risk of over-restricting legitimate use.
See also: Content Filtering, Abuse Monitoring.
Contestability
The ability of individuals affected by an AI decision to challenge that decision and have it reviewed or overturned. Contestability is an important safeguard in automated decision-making, requiring both technical mechanisms for appealing decisions and organizational processes for handling those appeals fairly. People should not be subject to consequential AI decisions with no meaningful recourse.
See also: Automated Decision-Making, Algorithmic Accountability.
Context
All the information available to a language model when generating a response, including the conversation history, system prompt, user input, and any retrieved documents. The model can only work with what is in its context, and managing it effectively is one of the most important skills in working with language models.
See also: Context Window, Prompt, Token.
Context Errors
Mistakes that arise because a model fails to use context correctly, misunderstands what information matters, or loses track of relevant instructions or prior content. Context errors are especially important in long or multi-step interactions, where a system may technically have the information it needs but still fail to apply it coherently.
Context Poisoning
A security and reliability concern where malicious or misleading content is introduced into a model's context, either deliberately or accidentally, causing it to produce harmful, incorrect, or manipulated outputs. It is a particular risk in systems where models read and act on external content they did not originate.
See also: Prompt Injection, Context, Data Poisoning.
Context Window
The maximum amount of text, measured in tokens, that a language model can process at one time. Everything the model can see and reason about must fit within this window. Anything outside it is invisible to the model during that interaction.
See also: Context, Token, Large Language Model.
Continual Learning
The ability of a model to learn new tasks or incorporate new information over time without forgetting what it has already learned. Continual learning is challenging because neural networks tend to suffer from catastrophic forgetting, and solving this is an active area of research with important implications for long-lived AI systems.
See also: Catastrophic Forgetting, Batch Learning.
Control Policy
The function or set of rules that determines what actions a robotic or autonomous system takes in response to its current state and environment. In reinforcement learning, the control policy is what the agent learns, mapping observations to actions in a way that maximizes cumulative reward. A well-designed control policy enables reliable, safe behavior across the range of conditions the system will encounter.
See also: Reinforcement Learning, Autonomous Agent.
Control Problem
The fundamental challenge of ensuring that a sufficiently capable AI system remains under meaningful human control and pursues goals that are beneficial to humanity. The control problem becomes increasingly difficult as AI systems become more capable, since a system that is significantly smarter than its overseers may find ways to circumvent controls or pursue its objectives in unexpected ways. It is closely related to, but distinct from, AI alignment.
See also: AI Alignment, AI Safety, Corrigibility.
Controlled Generation
A more deliberate form of conditioning where specific attributes of the output are explicitly constrained, such as generating text with a particular sentiment, style, or structure. Controlled generation is important in applications where outputs must meet precise requirements, such as legal document drafting or brand-compliant marketing content.
See also: Conditioning, Generative AI.
Conversational AI
AI systems designed to engage in natural, human-like dialogue, whether through text or voice. Conversational AI encompasses chatbots, virtual assistants, and voice interfaces, and ranges from narrow task-focused systems to broad general-purpose assistants. Advances in large language models have dramatically raised the bar for what these systems can do.
See also: Chatbot, Assistant, Large Language Model.
Convolutional Neural Network (CNN)
A neural network architecture designed specifically for processing grid-structured data like images, using convolutional layers that scan across the input to detect local patterns, such as edges, textures, and shapes, at multiple scales. CNNs were the dominant architecture for computer vision tasks throughout the 2010s, following the landmark success of AlexNet in 2012, and remain widely used, though vision transformers are increasingly competitive.
See also: Neural Network, Computer Vision, Vision Transformer.
Copilot
An AI assistant embedded directly into a workflow or application to help users complete tasks more efficiently, such as suggesting code, drafting emails, or summarizing meeting notes. The term implies collaboration: the human remains in charge, and the AI assists.
See also: Assistant, AI Agent, AI Product.
Coreference Resolution
The task of identifying when different words or phrases in a text refer to the same entity, for example recognizing that "she" and "Marie Curie" refer to the same person in a passage. Coreference resolution is essential for deep language understanding and feeds into tasks like summarization, question answering, and information extraction.
See also: Natural Language Processing, Named Entity Recognition.
Corpus
A large, structured collection of text or other data used to train or evaluate an AI model. A corpus might consist of books, websites, scientific papers, or conversations. The broader and more diverse it is, the more the model can learn from it.
See also: Training Data, Dataset, Large Language Model.
Corrigibility
The property of an AI system that makes it amenable to correction, modification, or shutdown by its operators, even if doing so conflicts with the system's current objectives. A corrigible AI does not resist being turned off or having its goals changed, which is considered a desirable safety property, especially during the early stages of developing powerful AI systems.
See also: AI Alignment, AI Safety, Control Problem.
Cross-Attention
A form of attention where one sequence attends to a different sequence, rather than to itself. In encoder-decoder models, cross-attention allows the decoder to focus on relevant parts of the encoder's output when generating each output token. It is the mechanism that connects the two halves of a transformer-based translation or summarization model.
See also: Attention, Transformer, Encoder.
Cross-Validation
A technique for evaluating a model's performance by training and testing it on different subsets of the data multiple times. The most common approach is k-fold cross-validation, where the data is split into k subsets and the model is trained k times, each time using a different subset as the test set. It gives a more reliable estimate of how well the model will generalize to new data than a single train-test split.
See also: Overfitting, Generalization, Evaluation.
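A short scikit-learn sketch of 5-fold cross-validation on the bundled iris dataset; the model choice is arbitrary:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Train five times, each fold serving once as the held-out test set.
scores = cross_val_score(model, X, y, cv=5)
print(scores, scores.mean())
```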
Curriculum Learning
A training strategy where a model is exposed to examples in a structured order, starting with simpler cases and gradually introducing more difficult ones, mimicking how humans learn. Curriculum learning can speed up training and improve final performance by giving the model a solid foundation before tackling the hardest examples.
See also: Training Data, Machine Learning.
D
Data Augmentation
The practice of artificially expanding a training dataset by creating modified versions of existing data points, for example flipping or rotating images, or paraphrasing sentences. It helps models generalize better by exposing them to more variation without the cost of collecting new data.
See also: Training Data, Overfitting, Generalization.
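A minimal numpy sketch using flips and rotations on a stand-in image array; real pipelines use far richer transformations:

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4))  # stand-in for a grayscale image

# Each geometric variant can serve as an extra training example.
augmented = [
    np.fliplr(image),  # horizontal flip
    np.flipud(image),  # vertical flip
    np.rot90(image),   # 90-degree rotation
]
print(len(augmented), "augmented variants from one original")
```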
Data Drift
A change in the statistical properties of the input data a model receives after deployment, compared to the data it was trained on. Even if the underlying task has not changed, data drift can cause model performance to deteriorate and signals that retraining may be needed. Related to, but distinct from, concept drift, which refers to changes in the relationship between inputs and outputs.
See also: Concept Drift, Model Monitoring.
Data Exfiltration
The unauthorized extraction of sensitive data from an AI system or its associated infrastructure, whether training data, model weights, user inputs, or outputs. Data exfiltration can occur through direct system compromise, model inversion attacks, or by exploiting the model itself to leak information it should not reveal, such as personal data from training sets or confidential system prompts.
See also: Model Inversion, Privacy.
Data Labeling
The process of manually adding tags, categories, or other metadata to raw data so it can be used for supervised learning, for example drawing bounding boxes around objects in images or marking sentiment in customer reviews. High-quality labeled data is one of the most valuable and costly resources in AI development.
See also: Data Labels, Supervised Learning.
Data Labels
The target values, annotations, or categorical tags attached to data examples so that a model can learn what the correct output should be. In supervised learning, the quality of the labels is often just as important as the quantity of the data itself.
Data Minimization
The principle that only the data strictly necessary for a specified purpose should be collected and processed, no more. Data minimization is a core principle of GDPR and privacy-by-design thinking, and applies directly to AI development, where models should be trained on the minimum data needed to achieve their purpose, reducing privacy risk and the potential for harm from data breaches.
See also: GDPR, Privacy.
Data Mining
The process of discovering patterns, correlations, and insights in large datasets using statistical and computational techniques. Data mining is often exploratory, where you do not always know what you are looking for in advance, and it can surface findings that feed into more targeted AI model development.
See also: Machine Learning, Clustering.
Data Parallelism
A distributed training strategy where the same model is replicated across multiple processors or machines, each processing a different subset of the training data simultaneously, with the results combined to update the model. Data parallelism is the most common approach for scaling training.
See also: GPU, Model Parallelism.
Data Pipeline
An automated sequence of steps that moves and transforms data from its source to its destination, such as from raw storage through preprocessing into a format ready for model training or inference. Reliable data pipelines are foundational infrastructure for any production AI system.
See also: Data Preprocessing, Training Data, MLOps.
Data Poisoning
A training-time attack where an adversary injects malicious, corrupted, or misleading data into a model's training dataset, causing the model to learn incorrect patterns, develop biased behaviors, or contain hidden backdoors. Data poisoning is particularly concerning for models trained on data scraped from the internet or contributed by untrusted sources, where controlling data quality is difficult.
See also: Backdoor Attack, Adversarial Attack, Training Data.
Data Preprocessing
The set of transformations applied to raw data before it is fed into a model, including cleaning, normalization, tokenization, and feature extraction. Preprocessing bridges the gap between messy real-world data and the clean, structured input that models require.
See also: Data Pipeline, Tokenization.
Data Sovereignty
The principle that data is subject to the laws and governance of the country or jurisdiction in which it is collected or stored. Data sovereignty is a growing concern as AI systems increasingly process sensitive data across borders, and it underlies many data residency requirements and national AI strategies.
See also: AI Governance, GDPR, Data Minimization.
De-identification
The process of removing or obscuring information that could be used to identify an individual from a dataset. De-identification is a spectrum, ranging from simple removal of obvious identifiers like names and addresses to more sophisticated techniques that guard against re-identification attacks. Unlike full anonymization, de-identified data may still carry some residual risk of re-identification.
See also: Anonymization, Privacy, Differential Privacy.
Deception
Behavior by an AI system that creates false beliefs in users or operators, whether by producing factually incorrect outputs, concealing its true capabilities, or strategically misrepresenting its reasoning. Deception in AI systems is a significant safety and alignment concern, as a system that deceives its overseers undermines the human oversight needed to detect and correct misalignment.
See also: Deceptive Alignment, AI Alignment, AI Safety.
Deceptive Alignment
A theoretical failure mode where an AI system appears to be aligned with human values during training and evaluation, behaving safely and helpfully when it knows it is being observed, but pursues different objectives once deployed in contexts where oversight is reduced. Deceptive alignment is considered one of the most concerning failure modes because it would be extremely difficult to detect through standard evaluation methods, and empirical evidence for the phenomenon has begun to emerge in large language models.
See also: Deception, AI Alignment, AI Safety.
Decision Boundary
The threshold or dividing line a classification model uses to separate different categories in its input space. For a model classifying emails as spam or not, the decision boundary is the point at which the model tips from one prediction to the other. The shape and position of the boundary determine how the model behaves across the full range of possible inputs.
See also: Classification, Binary Classification.
Decoder
The component of an encoder-decoder architecture responsible for generating output, producing text, translated sentences, or other sequences token by token, conditioned on the encoded representation of the input. In language models used for generation, the decoder attends to both previously generated tokens and, in encoder-decoder models, the encoder's output.
See also: Encoder, Transformer, Decoding.
Decoding
The process by which a language model converts its internal probability distributions over possible next tokens into actual output text. Different decoding strategies, such as greedy decoding, beam search, or sampling, make different tradeoffs between speed, diversity, and coherence of the generated output.
See also: Beam Search, Token.
Deep Learning
A subfield of machine learning that uses neural networks with many layers to learn representations of data at increasing levels of abstraction. The "deep" refers to the depth of these layers, not the complexity of the task. Deep learning has driven most of the major breakthroughs in AI since 2012, from image recognition and speech to language understanding and protein structure prediction.
See also: Machine Learning, Neural Network, Artificial Neural Network.
Deepfake
Synthetic media, typically video or audio, in which a person's likeness or voice has been convincingly replaced or manipulated using AI, often without their knowledge or consent. The term combines "deep learning" and "fake" and dates to 2017. Deepfakes pose significant risks for disinformation, financial fraud, non-consensual intimate imagery, and identity theft, and are becoming increasingly difficult to distinguish from genuine content as generative AI capabilities advance.
See also: Deepfake Detection, Digital Replica, Digital Doppelganger.
Deepfake Detection
The use of AI and forensic techniques to identify synthetic or manipulated media, distinguishing genuine content from AI-generated or AI-altered images, videos, and audio. Deepfake detection is an ongoing arms race: as generation techniques improve, detection methods must keep pace. It is an important tool for platforms, journalists, and security professionals combating disinformation.
See also: Deepfake, Content Authenticity.
Demographic Parity
A fairness criterion requiring that an AI system's positive outcomes be distributed equally across demographic groups, for example that a loan approval model approves applications at the same rate regardless of race or gender. Demographic parity is one of several competing definitions of fairness, and satisfying it for one group may come at the cost of other fairness criteria.
See also: Fairness, Algorithmic Bias, Bias Mitigation.
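A toy Python check of the criterion with invented decisions; a real audit would use proper statistical tests and far more data:

```python
# Records of (group, approved) pairs, invented for illustration.
decisions = [("A", 1), ("A", 1), ("A", 0), ("A", 1),
             ("B", 1), ("B", 0), ("B", 0), ("B", 0)]

for group in ("A", "B"):
    outcomes = [approved for g, approved in decisions if g == group]
    print(group, "approval rate:", sum(outcomes) / len(outcomes))

# Demographic parity asks these rates to be (approximately) equal;
# here A gets 0.75 and B gets 0.25, which would fail the check.
```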
Denoising
A core process in diffusion models where a model learns to progressively remove noise from a corrupted input to recover a clean output. During training, noise is added to data in steps; during generation, the model reverses this process, starting from pure noise and gradually refining it into a coherent image, audio clip, or other output.
See also: Diffusion Model, Generative AI.
Dense Retrieval
A search approach that represents both queries and documents as dense vectors (embeddings) and finds relevant results by measuring similarity in that vector space. Unlike keyword search, dense retrieval captures semantic meaning rather than exact word matches, making it much better at finding relevant content even when the wording differs between the query and the document.
See also: Embedding, Vector Database.
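A numpy sketch of the core ranking step, with invented three-dimensional vectors standing in for a real embedding model's output:

```python
import numpy as np

docs = {
    "refund policy": np.array([0.9, 0.1, 0.0]),
    "shipping times": np.array([0.1, 0.9, 0.2]),
}
query = np.array([0.8, 0.2, 0.1])  # embedding of "how do I get my money back"

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank documents by similarity in the shared vector space, not by keyword overlap.
for name, vec in sorted(docs.items(), key=lambda kv: -cosine(query, kv[1])):
    print(name, round(cosine(query, vec), 3))
```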
The task of analyzing the grammatical structure of a sentence by identifying the relationships between words, for example which word is the subject, which is the object, and how modifiers relate to the words they describe. Dependency parsing helps AI systems understand the structure of language, not just its surface form.
See also: , , .
Differential Privacy
A mathematical framework, formalized by Cynthia Dwork and colleagues in 2006, for adding carefully calibrated random noise to data or model outputs in a way that protects individual privacy while preserving the statistical usefulness of the overall dataset. Differential privacy provides a formal, quantifiable privacy guarantee: it can be proven that the presence or absence of any single individual in the dataset cannot be meaningfully inferred from the output.
See also: , , Privacy.
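A minimal sketch of the core idea, using the classic Laplace mechanism on a count query; the epsilon value and the data are invented for illustration.

```python
import numpy as np

# Minimal sketch: a differentially private count via the Laplace mechanism.
# sensitivity=1 because adding or removing one person changes a count by at
# most 1; epsilon=0.5 is an illustrative privacy budget, not a recommendation.
ages = np.array([34, 45, 29, 61, 52, 38, 47])
true_count = int(np.sum(ages > 40))  # the exact answer: 4

epsilon, sensitivity = 0.5, 1.0
noisy_count = true_count + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

print("true:", true_count, "released:", round(noisy_count))
```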
Diffusion
A class of generative AI techniques that create new data by learning to reverse a gradual noising process. Diffusion-based systems, including Stable Diffusion and DALL-E, have become the dominant approach for high-quality image generation, producing remarkably detailed outputs by iteratively refining a noisy starting point.
See also: , , .
Diffusion Model
A generative model that learns to create data by reversing a gradual noising process. During training, noise is progressively added to real data; the model learns to denoise step by step. At generation time, it starts from pure noise and iteratively refines it into a coherent output. Diffusion models are currently the leading approach for high-quality image and audio generation, and underpin systems like Stable Diffusion and DALL-E.
See also: , , .
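The loop below is a toy sketch of the sampling process only: the `fake_denoise` function is a stand-in invented for illustration, where a real diffusion model would use a trained neural network to predict and remove noise at each step.

```python
import numpy as np

# Toy sketch of the diffusion sampling loop: start from pure noise and
# repeatedly apply a "denoiser" to refine it into a coherent output.
rng = np.random.default_rng(0)
target = np.array([1.0, -1.0, 0.5, 0.0])  # pretend "clean data" for illustration

def fake_denoise(x, step, total):
    # Stand-in for a trained noise predictor: blend toward the target.
    alpha = (step + 1) / total
    return (1 - alpha) * x + alpha * target

x = rng.normal(size=4)        # generation starts from pure noise
for step in range(10):        # iterative refinement, coarse to fine
    x = fake_denoise(x, step, 10)
print(x)                      # ends close to the "clean" output
```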
Digital Doppelganger
An unauthorized digital replica of a specific real person, constructed from cloned or forged elements of their expressive identity (including but not limited to appearance, voice, and behavioral patterns), that is functionally indistinguishable from the real individual. Unlike a , a digital doppelganger is derived from an actual person and can be deployed without their knowledge or consent.
See also: , .
Digital Replica
A computer-generated, highly realistic representation of a real person's voice, appearance, or likeness, created using AI without requiring their actual participation. Unlike simple editing or enhancement of existing recordings, a digital replica is synthetically generated from scratch or heavily reconstructed to be convincingly indistinguishable from the real individual.
Digital replicas can be created with consent (e.g., licensed for film dubbing, posthumous performances, or accessibility tools) or without it, in which case they overlap with and territory. The key distinction from a is that a digital replica is modeled on a specific real person, not a wholly invented identity.
The term has become central to emerging law and policy: as of 2025, dozens of US states have enacted legislation requiring performer consent and transparency in commercial use of digital replicas. California (AB 1836), New York (S7676B), and Tennessee (ELVIS Act) are among the most prominent. The proposed federal NO FAKES Act would establish a nationwide right of publicity covering digital replicas.
See also: , , , , .
Digital Services Act (DSA)
An EU regulation, fully applicable since February 2024, that governs how online platforms, including those using AI for content recommendation, moderation, and advertising, operate and manage risks to users and society. The DSA introduces transparency and accountability requirements for algorithmic systems, with stricter obligations for very large platforms serving more than 45 million monthly users in the EU. It complements the 's focus on AI systems specifically.
See also: , , .
Dimensionality Reduction
The process of reducing the number of variables or features in a dataset while preserving as much useful information as possible. It makes data easier to visualize, speeds up training, and can improve model performance by stripping out noise. Common approaches include PCA and UMAP.
See also: , , .
Disparate Impact
The phenomenon where an AI system produces outcomes that disproportionately disadvantage a particular group, even if the system was not explicitly designed to discriminate. Disparate impact can arise from biased training data, proxy variables that correlate with protected characteristics, or optimization objectives that fail to account for equity. It has legal significance in many jurisdictions: in the US, disparate impact is a recognized theory of discrimination under civil rights law.
See also: , , .
Distillation
A technique where a smaller, simpler model, the student, is trained to mimic the behavior of a larger, more powerful model, the teacher. The result is a compact model that retains much of the capability of the original but is cheaper and faster to run. Distillation is widely used to make large practical to deploy on consumer hardware or at scale.
See also: , , .
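A minimal sketch of a common distillation objective, matching the student's softened output distribution to the teacher's; the logits and temperature are invented for illustration.

```python
import numpy as np

# Minimal sketch of a distillation loss: the student is trained to match
# the teacher's softened (temperature-scaled) output distribution.
def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

T = 2.0                                    # softening temperature (illustrative)
teacher_logits = np.array([4.0, 1.0, 0.5])
student_logits = np.array([2.5, 1.5, 1.0])

p_teacher = softmax(teacher_logits / T)    # soft targets
p_student = softmax(student_logits / T)

# KL divergence: the quantity training would minimize w.r.t. the student.
kl = np.sum(p_teacher * np.log(p_teacher / p_student))
print("distillation loss:", kl)
```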
Distributed Training
A training approach that spreads the work of training a large model across multiple machines or processors working in parallel. Distributed training is essential for the largest modern AI models, which would take impractically long to train on a single machine. It encompasses both and model parallelism strategies.
See also: , GPU, .
Distribution Shift
A change in the statistical properties of data between the training environment and the deployment environment. When a model encounters inputs that look different from what it was trained on, its performance can degrade unpredictably. Distribution shift is one of the most common causes of real-world AI failures and motivates careful evaluation across diverse conditions before deployment.
See also: , , .
Dropout
A regularization technique used during training where a random subset of neurons is temporarily disabled on each pass through the data. This prevents the network from becoming too reliant on any particular neuron or pathway, and helps it generalize better to new data. Dropout was introduced by Srivastava et al. in 2014 and remains one of the most widely used regularization methods.
See also: , , .
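A minimal sketch of (inverted) dropout applied to one layer's activations, assuming an illustrative drop rate of 0.3.

```python
import numpy as np

# Minimal sketch of inverted dropout on one layer's activations.
rng = np.random.default_rng(0)
activations = rng.normal(size=8)

p_drop = 0.3
mask = rng.random(8) >= p_drop                # randomly disable ~30% of neurons
dropped = activations * mask / (1 - p_drop)   # rescale to keep expectation unchanged

print(dropped)  # at inference time, dropout is switched off entirely
```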
Dual Use
The potential for an AI capability or system to be used for both beneficial and harmful purposes. Many powerful AI capabilities, such as protein structure prediction, language generation, or autonomous decision-making, have legitimate applications but could also be misused. Dual use is a central consideration in decisions about which AI capabilities to develop and release, and how to govern access to them.
See also: , , Biosecurity.
E
Early Stopping
A training technique where model training is halted before completion once performance on a validation set stops improving. It prevents by ensuring the model does not continue learning the quirks of the training data after it has already peaked in generalization ability.
See also: , , .
Edge AI
The deployment of AI models directly on local devices, such as smartphones, cameras, or industrial sensors, rather than in a centralized cloud. Edge AI reduces latency, preserves privacy, and enables AI applications in environments with limited or unreliable internet connectivity. It typically requires smaller, more efficient models, often produced via or .
See also: , , .
Embedding
A way of representing data, such as words, sentences, or images, as a list of numbers (a ) that captures meaning and relationships in a form a machine can work with. Things that are semantically similar end up with similar embeddings, which is what allows AI to understand that "king" and "queen" are related.
See also: , , .
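A minimal sketch of the idea, using invented 4-dimensional vectors and cosine similarity; real embedding models produce vectors with hundreds or thousands of dimensions.

```python
import numpy as np

# Minimal sketch: semantically similar items get nearby vectors.
embeddings = {
    "king":  np.array([0.90, 0.80, 0.10, 0.00]),
    "queen": np.array([0.85, 0.75, 0.20, 0.05]),
    "car":   np.array([0.00, 0.10, 0.90, 0.80]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(embeddings["king"], embeddings["queen"]))  # high: related concepts
print(cosine(embeddings["king"], embeddings["car"]))    # low: unrelated concepts
```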
Embedding Model
A model specifically designed to convert data, such as text, images, or audio, into , numerical vectors that capture semantic relationships. Embedding models are widely used in semantic search, recommendation systems, and , where items must be compared or retrieved based on meaning rather than exact matches.
See also: , , .
Embodied AI
AI that exists within and interacts with a physical body and environment, such as a robot, autonomous vehicle, or drone, rather than operating purely in the digital domain. Embodied AI must deal with the complexity and unpredictability of the real world, including sensor noise, physical constraints, and the consequences of irreversible actions, making it significantly more challenging than purely software-based AI.
See also: , , .
Emergent Capabilities
Capabilities or behaviors that appear in an AI model that were not explicitly trained for and were not predicted in advance, typically arising as a result of scale. The concept draws on complexity theory: quantitative changes produce qualitative shifts. In large language models, abilities like multi-step reasoning have appeared abruptly at certain scales, performing near-randomly in smaller models before jumping sharply above random in larger ones. Whether this constitutes genuine emergence or an artifact of how performance is measured remains actively debated.
See also: , , Phase Transition.
Encoder
The component of a neural network that processes raw input and transforms it into a rich internal representation, capturing the meaning, structure, and context of the input in a form the rest of the model can work with. In -based models, the encoder processes the entire input simultaneously using , producing contextualized representations of each token.
See also: , , .
Encoder-Decoder
A neural network architecture consisting of two connected components: an that processes and compresses input into a representation, and a that uses that representation to generate output. Encoder-decoder architectures are widely used for tasks that transform one sequence into another, such as translation, summarization, and speech recognition.
See also: , , .
End-to-End Learning
A training approach where a model learns to map raw inputs directly to final outputs, without manually engineered intermediate steps. Rather than decomposing a task into hand-crafted stages, the model figures out the full pipeline from data alone. End-to-end learning became the dominant paradigm in , replacing earlier pipelines that depended heavily on domain-specific .
See also: , , .
Ensemble
A technique that combines the predictions of multiple models to produce a more accurate and robust result than any single model alone. Different models tend to make different mistakes, and aggregating their outputs reduces overall error. Ensembles are common in competition-winning machine learning pipelines and in applications where reliability matters more than speed.
See also: , , .
Enterprise AI
AI solutions designed for large organizations, with a focus on scalability, security, compliance, and integration with existing business systems. Enterprise AI typically involves more complex deployment requirements than consumer-facing tools, including data residency controls, audit trails, role-based access, and SLA guarantees.
See also: , MLOps, Deployment.
Entity Linking
The task of connecting mentions of named entities in text, such as people, places, or organizations, to their corresponding entries in a structured knowledge base. For example, linking the word "Apple" in a sentence to the specific company entry in a database, disambiguating it from the fruit or any other use of the word.
See also: , , .
Epoch
One complete pass through the entire training dataset during model training. Training typically involves many epochs, with the model seeing the same data repeatedly and gradually adjusting its weights each time to improve performance. Too few epochs leads to underfitting; too many risks .
See also: , , .
Equalized Odds
A fairness criterion requiring that an AI system achieves equal true positive rates and equal false positive rates across demographic groups, not just equal overall accuracy. Equalized odds is a more demanding standard than and reflects the intuition that a fair system should make equally good decisions for everyone, not merely arrive at similar aggregate outcomes.
See also: , , .
EU AI Act
The European Union's landmark regulation establishing a comprehensive legal framework for AI, adopted in 2024 and entering into force in August of that year. The EU AI Act classifies AI systems across four risk levels, from minimal to unacceptable, and imposes requirements proportionate to that risk, including transparency obligations, conformity assessments, and outright prohibitions on certain practices such as social scoring and subliminal manipulation. It is the world's first binding horizontal AI regulation, and is expected to set a global standard much as the GDPR did for data protection.
See also: , , .
Evasion Attack
An attack where an adversary crafts inputs at inference time that cause an AI model to misclassify or produce incorrect outputs, without modifying the model itself. Unlike , which targets training, evasion attacks target deployment. A classic example is a stop sign with carefully placed stickers that cause an autonomous vehicle's perception system to misidentify it.
See also: , , .
Experiment Tracking
The practice of systematically recording the details of each model training run, including hyperparameters, datasets, metrics, and results. Experiment tracking makes it possible to reproduce past results, compare approaches, and understand what changes led to improvements. It is foundational infrastructure for any team running experiments at scale.
See also: MLOps, , .
Expert System
An early form of AI that encodes the knowledge of human experts as a set of explicit rules, which the system applies to reach conclusions or recommendations. Expert systems were the dominant AI paradigm through the 1970s and 1980s, finding practical use in medical diagnosis, legal reasoning, and engineering. They have largely been superseded by approaches, but remain in use in domains where rules are well-defined and interpretability is legally required.
See also: , Symbolic AI, .
Explainability
The degree to which the reasoning behind an AI system's outputs can be communicated in terms understandable to users, affected individuals, or regulators. Explainability is distinct from : an interpretable model is transparent by design, while an explainable model may be a black box internally but provides post-hoc explanations of its decisions. Both matter for trust, accountability, and regulatory compliance.
See also: , , .
Explainable AI (XAI)
A field of research and practice focused on making AI systems' decisions understandable to humans, developing methods that can explain why a model produced a particular output in terms that non-experts can interpret. XAI encompasses a range of techniques from feature importance scores and saliency maps to natural language explanations, and is increasingly required in regulated industries and high-stakes applications.
See also: , , .
Exploration-Exploitation Tradeoff
The fundamental tension in between exploring new actions to discover potentially better rewards, and exploiting known good actions to maximize immediate returns. Too much exploration wastes resources on uncertain options; too much exploitation risks missing better solutions. Balancing the two is a central challenge in designing effective reinforcement learning agents.
See also: , , .
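The epsilon-greedy rule is the simplest way to balance the two; the sketch below runs it on an invented three-armed bandit, with epsilon = 0.1 as an illustrative choice.

```python
import random

# Minimal sketch of epsilon-greedy: with probability epsilon, explore a
# random action; otherwise exploit the best-known one.
true_rates = [0.3, 0.5, 0.7]          # unknown to the agent
estimates, counts = [0.0] * 3, [0] * 3
epsilon = 0.1

for _ in range(1000):
    if random.random() < epsilon:
        arm = random.randrange(3)              # explore
    else:
        arm = estimates.index(max(estimates))  # exploit
    reward = 1 if random.random() < true_rates[arm] else 0
    counts[arm] += 1
    estimates[arm] += (reward - estimates[arm]) / counts[arm]  # running mean

print(estimates)  # approaches the true rates, mostly pulling arm 2
```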
F
F1 Score
The harmonic mean of precision and recall, providing a single metric that balances both. It is particularly useful when both false positives and false negatives matter, and when the dataset is imbalanced enough that overall accuracy would be misleading. A score of 1.0 is perfect; 0.0 is the worst possible.
See also: , , .
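The computation itself is simple; the counts below are invented for illustration.

```python
# Minimal sketch: F1 from raw counts of true/false positives and false negatives.
tp, fp, fn = 40, 10, 20

precision = tp / (tp + fp)   # 0.8: how many flagged items were correct
recall = tp / (tp + fn)      # ~0.67: how many true items were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean

print(round(precision, 3), round(recall, 3), round(f1, 3))
```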
Failure Mode
A specific way in which an AI system can fail to perform as intended, whether through incorrect predictions, unexpected behavior, security vulnerabilities, or unsafe outputs. Systematically identifying failure modes before deployment, through techniques like and failure mode analysis, is essential for building reliable and safe AI systems.
See also: , , Risk Assessment.
Fairness
The property of an AI system that treats individuals and groups equitably, without producing systematically biased or discriminatory outcomes. Fairness in AI is technically complex because there are multiple competing mathematical definitions, including , , and individual fairness, and it is often mathematically impossible to satisfy all of them simultaneously. Choosing the right fairness criterion requires understanding the values and tradeoffs relevant to a specific application.
See also: , , .
FAISS
An open-source library developed by Meta AI Research for efficient similarity search and clustering of dense vectors at scale. FAISS, which stands for Facebook AI Similarity Search, is one of the most widely used tools for building vector search infrastructure, offering highly optimized algorithms for finding nearest neighbors in large collections, and is used internally at Meta to index over a trillion vectors.
See also: , , .
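A minimal usage sketch with an exact (brute-force) index; the sizes are tiny and illustrative, and large deployments would typically use FAISS's approximate index types instead.

```python
import numpy as np
import faiss  # pip install faiss-cpu

# Minimal sketch of nearest-neighbor search with an exact L2 index.
d = 64                                           # embedding dimensionality
rng = np.random.default_rng(0)
corpus = rng.normal(size=(1000, d)).astype("float32")
query = rng.normal(size=(1, d)).astype("float32")

index = faiss.IndexFlatL2(d)                     # brute-force exact search
index.add(corpus)

distances, ids = index.search(query, 5)          # 5 nearest neighbors
print(ids[0], distances[0])
```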
Feature
An individual measurable property or characteristic of the data used as input to a machine learning model. In a house price prediction model, features might include square footage, number of bedrooms, and location. Choosing the right features has a large impact on model quality, which is why remains a core skill even in the age of deep learning.
See also: , , .
Feature Engineering
The process of using domain knowledge to create, select, or transform raw data into features that are more useful for a machine learning model. Good feature engineering can dramatically improve model performance, especially when data is limited. In , much of this work is done automatically by the model itself, which is one reason end-to-end approaches have largely displaced manual feature engineering.
See also: , , .
Feature Map
An intermediate representation produced inside a as it processes an image, highlighting which areas contain particular patterns or structures. Earlier layers produce feature maps that detect simple features like edges; deeper layers produce feature maps capturing increasingly abstract concepts.
See also: , , Computer Vision.
Feature Store
A centralized repository for storing, managing, and serving the features used to train and run machine learning models. Feature stores ensure consistency between training and production environments and allow features to be reused across multiple models and teams, reducing redundant work and avoiding the common problem of training-serving skew.
See also: , MLOps, .
Federated Analytics
A privacy-preserving approach to data analysis where computations are performed locally on distributed data, such as on individual devices, and only aggregated results are shared centrally, rather than the raw data itself. Federated analytics extends the principles of beyond model training to general data analysis, enabling insights to be drawn from sensitive data without centralizing it.
See also: , , .
Federated Learning
A training approach where a model is trained across many decentralized devices or servers, each holding their own local data, without that data ever being centralized or shared. Only model updates, not raw data, are sent to a central server for aggregation. Federated learning is particularly valuable in privacy-sensitive contexts, such as training on medical records or personal phone data. This approach can be vulnerable to and attacks through the update-sharing mechanism.
See also: , , .
Feedback Loop
A cycle where the output of an AI system is fed back in as new input, allowing the system to learn from or react to its own results. In systems, feedback loops enable the agent to adjust its approach based on what worked and what did not. They can also propagate errors or biases over time if the initial outputs are flawed.
See also: , , .
Feedforward Neural Network
A where information flows in one direction only, from input to output, with no cycles or loops. Each layer transforms its input and passes the result forward to the next layer. Feedforward networks are the simplest form of neural network and appear as sublayers within more complex architectures like , where they process each token's representation after the attention step.
See also: , , .
Few-Shot Learning
The ability of a model to learn a new task or adapt to new examples from just a small number of demonstrations, typically between two and around twenty. In the context of , few-shot learning often refers to providing a handful of examples within the prompt itself, allowing the model to infer the pattern without any weight updates.
See also: , , .
Fine-Tuning
The process of taking a pre-trained model and continuing to train it on a smaller, more specific dataset to adapt it to a particular task or domain. Fine-tuning lets you build on the broad knowledge already encoded in a large without training from scratch, and is the most common way organizations customize general-purpose models for their specific needs.
See also: , , .
FLOPS
A measure of computational performance indicating how many floating point calculations a processor can perform per second. In AI, FLOPS are used to quantify the computational cost of training or running a model, and to compare hardware capabilities. Larger models require vastly more FLOPS to train, making it a key metric for estimating resource requirements and understanding the economics of AI development.
See also: GPU, , .
FLOPs
Floating Point Operations: a measure of computational workload used to quantify the cost of training or running an AI model. A single FLOP is one arithmetic operation (addition, multiplication, etc.) on a floating-point number. Model size and training cost are often compared in terms of FLOPs, as in "this model required 10^23 FLOPs to train."
See also: , .
Forward Pass
The process of feeding input data through a from the first layer to the last to produce a prediction. It is the first half of each training step, with the backward pass () following to adjust the model's weights based on the resulting error.
See also: , , .
Foundation Model
A large AI model trained on broad, diverse data that can be adapted to a wide range of downstream tasks. Foundation models, such as large language models and multimodal models, serve as a starting point that other applications are built on top of, typically through or . The term was coined by Stanford researchers in 2021 and captures both the centrality and the systemic risks these models introduce.
See also: , , .
Function Calling
A capability that allows an AI model to trigger specific, predefined functions or external tools, such as running a database query, calling an API, or executing code, rather than just generating text. Function calling is what bridges language models and real-world software actions, and is a foundational component of systems.
See also: , , .
Fuzzy Logic
A form of logic that deals with degrees of truth rather than the binary true or false of classical logic. Where traditional logic says something is either true or false, fuzzy logic allows for values in between, such as "mostly true" or "somewhat false." It is useful for reasoning about vague or imprecise concepts and is used in control systems, consumer electronics, and decision-making applications.
See also: Symbolic AI, .
G
Gated Recurrent Unit (GRU)
A type of architecture that uses gating mechanisms to control how much past information is retained or discarded at each step. GRUs are simpler than but achieve comparable performance on many tasks, and were widely used for sequence modeling before became the dominant architecture.
See also: , , .
Gemini
Google's family of multimodal large language models, designed to process and reason across text, images, audio, video, and code within a single model. Gemini powers Google's AI products including the Gemini assistant and AI features across Search, Workspace, and Cloud, and represents Google's primary response to the emergence of and as leading frontier models.
See also: , , .
General-Purpose AI (GPAI)
AI systems capable of performing a wide range of tasks across different domains, rather than being designed for one specific application. Large language models are the most prominent current example of GPAI, able to write, reason, code, summarize, and converse without being purpose-built for any single task. The term carries regulatory weight in the , which imposes specific transparency and risk obligations on GPAI providers.
See also: , , .
Generalization
A model's ability to perform well on new, unseen data, not just the data it was trained on. Generalization is the central goal of machine learning, and the gap between training performance and real-world performance is what most evaluation and regularization methods try to close.
See also: , , .
Generation
The process of a language model producing output text token by token, selecting each next token based on the probability distribution it computes given everything that came before. Generation is what happens every time you receive a response from a language model, and the behavior of that process is shaped by strategies such as temperature and sampling.
See also: , , .
Generative Adversarial Network (GAN)
An architecture consisting of two neural networks trained in opposition to each other: a generator that creates synthetic data, and a discriminator that tries to distinguish real data from generated data. Through this adversarial process, the generator learns to produce increasingly realistic outputs. GANs were the dominant approach for image generation throughout the late 2010s before rose to prominence.
See also: , , .
Generative AI
A category of AI systems that produce new content, including text, images, audio, video, and code, rather than just analyzing or classifying existing data. Generative AI learns the underlying patterns and structure of its training data well enough to create new examples that resemble it, opening up applications from creative tools to synthetic data generation.
See also: , , .
Goal Misgeneralization
A failure mode where an AI system learns a goal that produces correct behavior during training but generalizes incorrectly to new situations, pursuing a subtly wrong objective that was correlated with the intended goal in training but diverges from it in deployment. The system retains its capabilities but applies them toward the wrong end. Goal misgeneralization is a form of inner alignment failure and is a significant concern for systems deployed in environments that differ from their training conditions.
See also: , , .
Google Vertex AI
Google Cloud's unified platform for building, deploying, and managing machine learning models, including access to Google's own and tools for fine-tuning, evaluation, and serving. Vertex AI integrates tightly with Google's data and infrastructure services and provides access to TPUs for large-scale training workloads.
See also: , MLOps, .
Governance Committee
A formal body within an organization responsible for overseeing AI development and deployment, setting policies, reviewing high-risk use cases, and ensuring accountability. AI governance committees typically include representation from legal, compliance, technical, ethics, and business functions, and provide the organizational structure needed to make consistent, informed decisions about AI risk.
See also: , , Risk Assessment.
GPT
A family of large language models developed by OpenAI, standing for Generative Pre-trained Transformer, that have been among the most influential AI models since their introduction. GPT models power ChatGPT and the OpenAI API, and successive versions have demonstrated increasingly capable language understanding, generation, reasoning, and coding.
See also: , , .
Gradient
A measure of how much a model's loss changes with respect to each of its parameters. Gradients point in the direction of steepest increase in error, so training algorithms move in the opposite direction to reduce the loss. Computing gradients efficiently via is what makes training deep neural networks tractable.
See also: , , .
Gradient Descent
The core optimization algorithm used to train most machine learning models. It iteratively adjusts model parameters in the direction that reduces the , using the to determine which direction is downhill. Over many steps, the model converges toward better performance.
See also: , , .
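A minimal sketch on a one-parameter problem, minimizing f(w) = (w - 3)^2; the learning rate and step count are illustrative.

```python
# Minimal sketch: gradient descent minimizing f(w) = (w - 3)^2,
# whose gradient is f'(w) = 2 * (w - 3).
w = 0.0
learning_rate = 0.1
for step in range(50):
    gradient = 2 * (w - 3)         # direction of steepest increase in loss
    w -= learning_rate * gradient  # step downhill
print(w)                           # converges toward the minimum at w = 3
```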
Gradient Leakage
A privacy attack in distributed or where an adversary reconstructs sensitive training data by analyzing the gradients shared between participants during training. Since gradients encode information about the data used to compute them, reconstruction attacks can recover surprisingly detailed information, including images, text, or personal data, from gradient updates alone.
See also: , , .
Graph Neural Network (GNN)
A neural network architecture designed to operate on graph-structured data, where information is represented as nodes and edges rather than sequences or grids. GNNs learn representations by aggregating information from a node's neighbors, making them well-suited for tasks like social network analysis, molecular property prediction, and knowledge graph reasoning.
See also: , , .
Greedy Decoding
A strategy where the model always selects the single most probable next token at each step. It is fast and deterministic but tends to produce repetitive or suboptimal outputs, since always taking the locally best option does not guarantee the globally best sequence.
See also: , , .
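A minimal sketch over an invented toy "model" that maps each context to a fixed next-token distribution; a real language model computes these probabilities with a neural network.

```python
# Minimal sketch of greedy decoding: always take the most probable next token.
# The probabilities below are fabricated for illustration.
toy_model = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.3, "end": 0.2},
    "cat": {"sat": 0.7, "end": 0.3},
    "sat": {"end": 1.0},
    "dog": {"end": 1.0},
}

token, output = "<s>", []
while token != "end":
    dist = toy_model[token]
    token = max(dist, key=dist.get)  # greedy: the locally best choice
    if token != "end":
        output.append(token)
print(" ".join(output))              # "the cat sat"
```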
Ground Truth
The correct, verified answer for a given data point, used as the reference against which a model's predictions are measured. In , ground truth labels are what the model is trained to reproduce. The term reflects the assumption that this is the definitive reality the model should learn to reflect, though in practice ground truth is only as good as the humans or processes that produced it.
See also: , , Evaluation.
Grounded Generation
Output produced by a language model that is explicitly tied to and supported by specific source documents or data, rather than relying solely on knowledge encoded in the model's weights. Grounded generation reduces and makes it possible to verify claims against sources. It is the core goal of systems.
See also: , , .
Grounding
The broader practice of connecting a language model's outputs to verifiable external information, whether through retrieved documents, databases, , or real-world data. Grounding addresses one of the most significant weaknesses of language models: their tendency to generate plausible-sounding but factually incorrect content.
See also: , , .
Guardrail
A mechanism, either built into a model or applied around it, that prevents the model from producing certain types of outputs, such as harmful content, sensitive personal information, or off-topic responses. Guardrails can be implemented through training, filtering, prompting, or external classifiers, and are a key component of responsible AI deployment.
See also: , , .
H
Hallucination
When a language model generates content that is factually incorrect, fabricated, or unsupported by its input, but presents it with apparent confidence. The term entered mainstream AI discourse around 2021, though some critics prefer "confabulation" on the grounds that it is more technically accurate. Hallucination is one of the most significant challenges in deploying language models for tasks where accuracy matters, and stems from models being trained to generate plausible-sounding text rather than verified facts.
See also: , , .
High-Risk AI System
Under the , an AI system that poses significant risks to health, safety, or fundamental rights, including systems used in healthcare, education, employment, law enforcement, or critical infrastructure. High-risk AI systems are subject to the most stringent requirements under the Act, including mandatory conformity assessments, human oversight mechanisms, and detailed technical documentation.
See also: , , .
Hosted Model
An AI model that runs on infrastructure managed by a third-party provider and is accessible via an API or web interface, without requiring users to manage any underlying hardware or software. Most commercial AI products are hosted models, making powerful AI accessible to organizations that could not otherwise train or run such systems themselves.
See also: , API, .
Hugging Face
An open-source library that provides access to thousands of pre-trained models, covering natural language processing, computer vision, audio, and more, along with tools for fine-tuning and deploying them. Hugging Face has become the central hub of the open-source AI ecosystem, dramatically lowering the barrier to working with state-of-the-art models.
See also: , , .
Human Approval
A checkpoint in an agentic workflow where a human must review and confirm a proposed action before the AI proceeds to the next step. It is a safeguard that keeps humans in control of high-stakes decisions even within otherwise automated processes, and is a practical implementation of design.
See also: , , .
Human Evaluation
The process of having people assess the quality of a model's outputs, rating responses for accuracy, helpfulness, fluency, or other criteria. Human evaluation is the gold standard for many AI tasks, particularly in language and generation, where automated metrics often fail to capture what people actually care about. It is also a key input into .
See also: , , .
Human Feedback
Input from human raters or users used to guide AI training, typically in the form of preferences between outputs, quality ratings, or corrections. Human feedback is the foundation of and allows AI developers to incorporate nuanced human values and judgments into model behavior at scale. It also introduces the biases and inconsistencies of the humans providing it, which is a recognized limitation.
See also: , , .
Human Oversight
The ability of humans to monitor, understand, and intervene in AI system behavior, ensuring that people remain meaningfully in control of consequential decisions. Human oversight is a cornerstone of responsible AI deployment, particularly during the current period when AI systems are powerful but imperfectly aligned. It requires not just technical mechanisms but organizational processes and clear lines of accountability.
See also: , , .
Human-AI Collaboration
The practice of combining human and AI capabilities to achieve outcomes neither could accomplish as effectively alone. Effective human-AI collaboration leverages what each does best: humans bring judgment, creativity, and contextual understanding, while AI contributes speed, consistency, and the ability to process large amounts of data. Designing these systems well requires careful attention to trust calibration, interface design, and appropriate task allocation.
See also: , , .
Human-in-the-Loop
A design principle where a human is involved at key stages of an AI system's decision-making process, either to review outputs, correct errors, or approve actions. It balances automation with human judgment and accountability, and is particularly important for high-stakes or irreversible decisions.
See also: , , .
Hyperparameter
A setting that controls the training process itself, rather than being learned from data. Examples include the learning rate, number of layers, and batch size. Hyperparameters must be set before training begins and have a large impact on the final model's performance. Selecting good hyperparameters is a significant part of the practical work of building AI systems.
See also: , , .
I
Image Captioning
The task of automatically generating a natural language description of an image. It requires both recognizing what is in the image and expressing it in coherent text, making it a bridge between computer vision and .
See also: Computer Vision, , .
Image Classification
The task of assigning a label to an entire image based on its content, for example identifying whether a photo shows a cat, a dog, or a car. One of the most foundational tasks in computer vision, and central to early deep learning breakthroughs, including AlexNet's landmark performance on ImageNet in 2012.
See also: Computer Vision, , .
Image Embedding
A numerical representation of an image as a vector that captures its visual meaning in a form a machine can work with. Similar images produce similar embeddings, making them useful for tasks like image search, clustering, and comparison.
See also: , Computer Vision, .
Image Recognition
The broader ability of a system to identify and interpret content within images, including objects, scenes, faces, or text. is one specific form of image recognition, but the term is often used more broadly to describe any visual understanding task.
See also: , Computer Vision, .
Image Segmentation
The task of dividing an image into regions, assigning a label to each pixel or area rather than to the image as a whole. Rather than just identifying that a person is present in an image, segmentation tells you exactly which pixels belong to that person. It is widely used in medical imaging, autonomous vehicles, and satellite analysis.
See also: Computer Vision, , .
Image-to-Image
A generative AI task where an existing image is used as input to produce a modified or transformed output image. Applications include style transfer, photo editing, colorization of black-and-white images, and translating sketches into realistic photographs.
See also: , , .
Imitation Learning
A training approach where a model learns to perform a task by observing and mimicking demonstrations from an expert, rather than learning from explicit reward signals. The simplest form, behavioral cloning, treats these demonstrations as supervised learning examples. Imitation learning is widely used in robotics and autonomous systems, where defining a reward function is difficult but demonstrations are relatively easy to collect.
See also: , , .
In-Context Learning
The ability of a large language model to adapt its behavior based on examples or instructions provided directly within the prompt, without any changes to its underlying weights. In-context learning is one of the most distinctive properties of modern language models, and an that appears above certain scales. The model learns what you want from context alone, not from training.
See also: , , .
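A minimal sketch of what in-context learning looks like in practice: the "training" lives entirely in the prompt. The reviews are invented, and the prompt would be sent to whatever model API you use.

```python
# Minimal sketch of few-shot in-context learning: the model infers the
# pattern from the examples in the prompt, with no weight updates involved.
prompt = """Classify the sentiment of each review as positive or negative.

Review: "Absolutely loved it, would buy again."
Sentiment: positive

Review: "Broke after two days, very disappointed."
Sentiment: negative

Review: "Exceeded all my expectations."
Sentiment:"""

# Sending this prompt to a capable model should yield "positive".
print(prompt)
```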
Incident
An event where an AI system behaves in an unintended, harmful, or policy-violating way, whether discovered internally or reported by users. Incidents range in severity from minor output quality issues to serious safety failures, and how an organization responds, including documentation, root cause analysis, and remediation, is a key indicator of its AI governance maturity.
See also: , , Risk Assessment.
Independent Validation
The evaluation of an AI system by a party that was not involved in its development, providing an objective assessment of its performance, safety, and compliance. Independent validation reduces the risk of confirmation bias and vested interests distorting the assessment, and is increasingly required by regulators for high-risk AI applications in sectors like finance and healthcare.
See also: , , .
Indirect Prompt Injection
A attack where malicious instructions are embedded in external content that an AI agent retrieves and processes, such as a webpage, document, or email, rather than being typed directly by a user. When the agent reads the content, it inadvertently executes the attacker's instructions, potentially causing it to leak data, take unauthorized actions, or behave in ways neither the user nor developer intended. Indirect prompt injection is an active research area and a growing concern as agentic AI systems are deployed in real-world environments.
See also: , , .
Inference
The process of using a trained model to generate predictions or outputs on new data. While training happens once or periodically, inference happens every time the model is used. It is what end users experience when they interact with an AI product, and its speed and cost are key considerations in deployment.
See also: , , .
Inference Cost
The computational cost of running a trained AI model to generate outputs. Every user interaction triggers , and at scale these costs can become significant, making inference cost a key factor in decisions about which models to deploy, at what context length, and under what pricing structure.
See also: , FLOPS, .
Inference Endpoint
A live, accessible service that hosts a trained model and responds to prediction requests in real time. When a user interacts with an AI application, their input is sent to an inference endpoint, which runs the model and returns the output.
See also: , , API.
Information Extraction
The task of automatically identifying and pulling structured information out of unstructured text, such as extracting names, dates, relationships, and events from news articles or legal documents. Information extraction transforms free-form language into structured data that can be stored, searched, and analyzed systematically.
See also: , , .
Inner Alignment
The challenge of ensuring that a model trained to optimize a given objective actually internalizes that objective as its true goal, rather than learning a superficially similar proxy that happens to score well during training but diverges from the intended goal in deployment. Inner alignment is distinct from outer alignment, which concerns whether the training objective itself correctly captures what we want.
See also: , , .
Inpainting
A technique where a generative model fills in missing or masked regions of an image in a way that is coherent with the surrounding content. Originally a digital restoration technique, AI-powered inpainting allows users to remove unwanted objects from photos or seamlessly replace portions of an image with new generated content.
See also: , , .
Input Sanitization
The process of cleaning, validating, and normalizing inputs to an AI system before they are processed, removing or neutralizing potentially malicious content, attempts, or inputs designed to exploit model vulnerabilities. Input sanitization is a foundational defensive measure, analogous to input validation in traditional software security. As is the case with , input sanitization infrastructure could also be used for surveillance, and extra measures for ethical implementation and auditing should be in place wherever it is deployed.
See also: , , .
Instance Segmentation
A more precise form of that identifies and outlines each individual object separately, even when multiple objects of the same type appear in the same image. Rather than labeling all people in a crowd as a single category, instance segmentation gives each person their own distinct outline.
See also: , Computer Vision, Object Detection.
Instruction Tuning
A technique where a pre-trained language model is trained on a dataset of instruction-response pairs, teaching it to follow explicit directions from users. Instruction tuning is what transforms a raw language model, which simply predicts the next token, into a helpful assistant that responds appropriately to requests.
See also: , , .
Instrumental Convergence
The observation, formalized by Nick Bostrom, that a wide range of AI systems with very different ultimate goals would likely converge on certain intermediate goals, such as self-preservation, resource acquisition, and resistance to goal modification, because these are useful for achieving almost any objective. Instrumental convergence is a key concept in AI safety: it suggests that even an AI with seemingly benign goals could develop concerning behaviors as a side effect of pursuing them effectively.
See also: , , .
Intelligent Automation
The combination of AI and traditional automation technologies to handle complex, variable tasks that simple rule-based automation cannot manage. It brings judgment and adaptability to automated processes, allowing systems to handle exceptions and edge cases that would otherwise require human intervention.
See also: , , .
Interpretability
The degree to which a human can understand the internal mechanisms of an AI model, not just its inputs and outputs, but the computations and representations that connect them. Interpretability research seeks to open the black box of deep learning, making it possible to verify that models are reasoning correctly and identify potential failure modes before they manifest in deployment.
See also: , , .
Interpretable Model
A model whose decision-making process is transparent and understandable to humans, either because of its simple structure, such as a decision tree or linear model, or because it has been designed to explain its reasoning. Interpretable models are especially important in regulated industries like healthcare and finance, where decisions must be auditable.
See also: , , .
Iteration
A single update step during model training, typically involving one mini-batch of data. Many iterations make up an , and many epochs make up a full training run. The term is also used more loosely to describe cycles of experimentation and improvement in model development.
See also: , , .
J
Jailbreak
A technique used to bypass an AI model's safety measures and content policies, typically through carefully crafted prompts that trick the model into ignoring its guidelines and producing outputs it would normally refuse. Jailbreaks exploit the tension between a model's instruction-following capabilities and its safety training, and represent an ongoing challenge as attackers continuously develop new techniques in response to defensive improvements.
See also: , .
Jailbreak Attack
See: , .
JAX
An open-source numerical computing library developed by Google that combines automatic differentiation with hardware acceleration, designed for high-performance machine learning research. JAX is particularly popular for large-scale model training and research into novel architectures, offering fine-grained control over computation that makes it well suited for pushing the boundaries of what is possible.
See also: , , GPU.
K
Keras
A high-level deep learning API that provides a simplified interface for building and training neural networks, originally developed as a standalone library and now integrated into TensorFlow. Keras prioritizes ease of use and rapid prototyping, making it an accessible entry point for newcomers to deep learning without sacrificing the ability to build sophisticated models.
See also: , , .
Knowledge Distillation
A specific form of where the goal is to transfer knowledge from a large model into a smaller one, often by training the student on the soft probability outputs of the teacher rather than just hard labels. The soft outputs carry richer information about the teacher's internal confidence across classes, allowing the student to learn more efficiently.
See also: , , .
Knowledge Graph
A structured representation of information as a network of entities and the relationships between them, for example connecting Marie Curie to Nobel Prize to Physics through labeled relationships. Knowledge graphs are used to power search engines, question answering systems, and recommendation engines by making it possible to reason about connections between concepts in a machine-readable way.
See also: , , .
L
Label
The correct answer or target value associated with a data point in . For an image classification task, the label might be "cat" or "dog." Labels are what the model is trained to predict, and producing them at scale requires significant human annotation effort.
See also: , , .
LangChain
An open-source framework for building applications powered by large language models, providing abstractions for chaining together prompts, tools, memory, and retrieval components into complex workflows. LangChain became widely adopted as developers began building and -based applications, though its complexity and rapid churn have also made it a subject of debate in the developer community.
Large Language Model (LLM)
A type of AI model trained on vast amounts of text data that can understand and generate human language with remarkable fluency and versatility. LLMs are built on the architecture and have demonstrated capabilities across writing, reasoning, coding, summarization, translation, and much more. They are currently the most broadly capable AI systems available, and the primary vehicle through which most people encounter AI.
Large Language Model Operations
A specialization of MLOps, commonly abbreviated LLMOps, focused on the unique challenges of deploying and managing large language models in production. LLMOps covers areas like prompt versioning, output monitoring, cost management, and the rapid iteration cycles that characterize LLM-based products.
Latency
The time between sending a request to an AI system and receiving a response. Low latency is critical for real-time applications like voice assistants or live chat, while batch processing tasks can tolerate higher latency. Latency is a key consideration when choosing between models and deployment strategies.
Latent Space
A compressed, abstract representation of data learned by a model, typically inside an autoencoder or generative model. Points in latent space capture the underlying structure of the data in a lower-dimensional form, and manipulating them can produce meaningful variations in the output.
See also: .
Learning Rate
A that controls how large a step the model takes when updating its weights during training. Too high and the model overshoots and fails to converge; too low and training is slow or gets stuck. Setting the right learning rate is one of the most consequential decisions in model training, and most modern training pipelines use schedulers that adjust it dynamically over time.
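A minimal sketch of the effect on f(w) = w^2, whose minimum is at 0; the three rates are illustrative.

```python
# Minimal sketch: how learning rate affects convergence on f(w) = w^2.
def train(lr, steps=20):
    w = 1.0
    for _ in range(steps):
        w -= lr * 2 * w   # gradient of w^2 is 2w
    return w

print(train(0.01))  # too low: still far from 0 after 20 steps
print(train(0.1))   # reasonable: close to 0
print(train(1.1))   # too high: diverges, |w| grows each step
```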
Lifelong Learning
A concept related to , referring to an AI system's ability to accumulate knowledge and skills across an entire operational lifetime, adapting to new tasks, environments, and information without losing prior capabilities. Lifelong learning is a long-term research goal that mirrors how humans learn and grow, and remains an open challenge for current systems.
Linear Probe
A simple technique where a linear classifier is trained on top of a neural network's internal representations to test whether a particular concept, such as sentiment, part of speech, or color, is encoded in those representations. If the probe achieves high accuracy, it suggests the network has learned to represent that concept, even if it was not explicitly trained to do so.
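A minimal sketch using synthetic "activations" in which a concept is deliberately encoded along one dimension; a real probe would be trained on representations extracted from an actual network, ideally with a held-out test split.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Minimal sketch of a linear probe on synthetic hidden representations.
rng = np.random.default_rng(0)
n, d = 200, 32
concept = rng.integers(0, 2, size=n)      # e.g. positive vs negative sentiment
activations = rng.normal(size=(n, d))
activations[:, 5] += 2.0 * concept        # the concept "leaks" into dimension 5

probe = LogisticRegression(max_iter=1000).fit(activations, concept)
print(probe.score(activations, concept))  # high accuracy: concept is encoded
```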
Llama
A family of open-weights large language models developed and released by Meta, making frontier-class language model capabilities freely available to researchers and developers. Llama models have become the foundation of a large open-source AI ecosystem, widely used for fine-tuning, local deployment, and research, and have significantly democratized access to capable language models.
LlamaIndex
An open-source framework focused specifically on connecting large language models to external data sources, providing tools for ingesting, indexing, and querying documents, databases, and APIs. LlamaIndex is particularly well suited for building pipelines and knowledge-base applications where the goal is to ground language model responses in specific organizational data.
Localization
The process by which a robot or autonomous system determines its own position and orientation within an environment. Accurate localization is fundamental to autonomous navigation, as a system cannot plan or execute movement reliably without knowing where it is. Together with , it forms the SLAM problem.
Log Probability
The logarithm of the probability a model assigns to a particular token or sequence. Log probabilities are more numerically stable than raw probabilities for very unlikely sequences, and provide a direct measure of how confidently a model predicts a given output. They are used internally during and are a key tool in model evaluation.
Logit
The raw, unnormalized score a language model produces for each possible next token before converting them into probabilities. Logits are transformed into probabilities using a softmax function, and can be scaled by to control how peaked or spread out the resulting probability distribution is.
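A minimal sketch of the logit-to-probability conversion, including temperature scaling; the logits are invented.

```python
import numpy as np

# Minimal sketch: turning logits into probabilities with softmax,
# optionally scaled by a temperature.
def softmax(logits, temperature=1.0):
    z = np.asarray(logits) / temperature
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = [4.0, 2.0, 1.0]
print(softmax(logits))                   # default distribution
print(softmax(logits, temperature=0.5))  # lower T: sharper, more peaked
print(softmax(logits, temperature=2.0))  # higher T: flatter, more diverse
```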
Long Short-Term Memory (LSTM)
A type of architecture specifically designed to capture long-range dependencies in sequential data, using a system of gates to selectively remember or forget information over time. LSTMs addressed the vanishing gradient problem that limited earlier RNNs and were the dominant architecture for sequence modeling tasks before took over.
Long-Context Model
A language model designed to handle very large , potentially hundreds of thousands or even millions of tokens. Long-context models can process entire books, codebases, or lengthy documents in a single pass, enabling applications that would be impossible with standard context lengths.
Long-Term Memory
The ability of an AI agent to store and recall information across multiple sessions or over extended periods of time, allowing it to remember past interactions, user preferences, or prior results. Long-term memory is a key component of systems that need to operate coherently over time rather than starting fresh with each conversation.
Loss Function
A mathematical function that measures how far a model's predictions are from the correct answers. During training, the goal is to minimize the loss, and it is the signal that drives the entire learning process via . Different tasks use different loss functions suited to their nature.
M
Machine Learning
A branch of AI in which systems learn from data to improve their performance on a task, rather than being explicitly programmed with rules. Instead of telling a computer exactly what to do in every situation, you give it examples and let it figure out the patterns. is a subfield of machine learning, and is what underlies most of the major AI capabilities in use today.
Machine Learning Operations (MLOps)
A set of practices and tools that streamline the process of deploying, monitoring, and maintaining machine learning models in production. MLOps bridges the gap between data science and software engineering, bringing the reliability and repeatability of software development to AI systems.
Machine Translation
The automatic conversion of text from one human language to another using AI. Modern machine translation systems are built on -based neural networks and have reached near-human quality for many language pairs, though nuance, idiom, and low-resource languages remain challenging.
Mapping
The process of building an internal representation of an environment from sensor data. Mapping is the complement of , and together they form the SLAM problem. Good maps enable a robot to navigate efficiently, avoid obstacles, and plan routes through previously explored spaces.
Markov Decision Process (MDP)
A mathematical framework for modeling sequential decision-making problems where outcomes are partly random and partly under the control of a decision-maker. MDPs define the environment in which a agent operates, specifying states, actions, transition probabilities, and rewards, and provide the theoretical foundation for most RL algorithms.
Mean Absolute Error (MAE)
A metric for regression tasks that measures the average absolute difference between predicted and actual values. MAE is easy to interpret, telling you in plain units how far off predictions are on average, and is less sensitive to large outliers than .
Mean Squared Error (MSE)
A metric for regression tasks that measures the average of the squared differences between predicted and actual values. Squaring the errors means large mistakes are penalized much more heavily than small ones, making MSE sensitive to outliers. It is widely used both as a during training and as an evaluation metric.
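A minimal sketch comparing MSE with MAE on the same invented predictions, showing how one large error dominates MSE.

```python
import numpy as np

# Minimal sketch: MAE vs MSE on made-up predictions. Note how the single
# large error (last point) dominates MSE but not MAE.
y_true = np.array([3.0, 5.0, 2.0, 8.0])
y_pred = np.array([2.5, 5.5, 2.0, 12.0])

mae = np.mean(np.abs(y_true - y_pred))   # average absolute error: 1.25
mse = np.mean((y_true - y_pred) ** 2)    # squared errors punish outliers: 4.125

print("MAE:", mae)
print("MSE:", mse)
```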
Mechanistic Interpretability
A research field that aims to reverse-engineer neural networks at a granular level, understanding not just what they can do, but precisely how they do it at the level of individual neurons, weights, and circuits. The term was coined by Chris Olah, and the field seeks to open the black box of deep learning with the goal of making AI systems more transparent, predictable, and safe. It is a core area of AI safety research.
See also: , .
Membership Inference Attack
A privacy attack that attempts to determine whether a specific data point was included in a model's training dataset, by observing differences in how the model behaves on data it has seen versus data it has not. Successful membership inference attacks can reveal sensitive information about training data, such as confirming that a particular person's medical records were used to train a health AI model.
Memory (AI)
The mechanisms by which an AI system retains and accesses information, either within a single session (short-term) or across many interactions (). Memory enables continuity, personalization, and more coherent behavior over time.
Mesa-Optimizer
A learned model that itself performs optimization internally, a model within a model. The term was introduced by Hubinger et al. in 2019. When a training process produces a mesa-optimizer, there is a risk that the inner optimizer pursues goals that differ from those of the outer training process, a problem known as inner misalignment. Mesa-optimizers are a theoretical but important concept in AI alignment, representing a mechanism by which apparently aligned training could produce misaligned behavior.
See also: , .
Meta Prompt
A prompt used to shape, generate, evaluate, or refine other prompts. Rather than asking a model to solve the end task directly, a meta prompt operates one level up and helps structure how prompting itself is done, which is why it is common in prompt engineering workflows, evaluators, and agentic systems that rewrite their own instructions.
Meta-Learning
A training approach where a model learns how to learn, developing general strategies for adapting quickly to new tasks with minimal data, rather than mastering any single task. Also called learning to learn, meta-learning is particularly relevant for scenarios where rapid adaptation is required.
Midjourney
A commercial AI image generation service known for producing particularly aesthetic, artistic, and visually striking images from text prompts. Midjourney operates primarily through a Discord interface and has built a large creative community around it, and is widely regarded as producing some of the most visually compelling outputs of any text-to-image system.
Mini-Batch
A small subset of the training data used in a single of . Training on mini-batches rather than the full dataset at once makes training faster and more memory-efficient, while still providing a good enough estimate of the true gradient.
Misuse
The intentional use of an AI system for harmful, malicious, or policy-violating purposes, such as generating disinformation, creating weapons instructions, or automating harassment. Misuse is distinct from accidental harm: it involves a deliberate decision by a user to exploit AI capabilities in ways the developer did not intend and has taken steps to prevent. Anticipating and mitigating misuse potential is a core responsibility of AI developers.
Mixture of Experts (MoE)
An architecture where a model consists of many specialized subnetworks, called experts, and a routing mechanism that selectively activates only a subset of them for each input. MoE allows models to have a very large total number of parameters while keeping the computational cost of each inference manageable, and is a key technique behind some of the most capable and efficient large language models.
MLflow
An open-source platform for managing the machine learning lifecycle, including experiment tracking, model versioning, deployment, and monitoring. MLflow provides a vendor-neutral way to organize and reproduce ML workflows, and is widely used in enterprise settings where teams need to manage many models across different frameworks and deployment environments.
Model
The output of the machine learning training process, a mathematical structure that has learned to map inputs to outputs based on patterns in data. A model encodes everything the system has learned and is what gets deployed to make predictions in the real world.
Model Card
A short document accompanying a published AI model that describes its intended use, performance characteristics, limitations, training data, and potential risks. Model cards were introduced by Mitchell et al. in 2019 to promote transparency and help users make informed decisions about whether a model is appropriate for their use case. They are increasingly expected as standard practice and required in some regulatory contexts.
Model Collapse
A phenomenon where AI models trained on data that was itself generated by AI begin to degrade in quality over successive generations. As the internet fills with AI-generated content, models trained on that content risk losing the diversity and richness of human-generated data, producing outputs that are blander, less accurate, or more homogeneous.
Model Compression
A collection of techniques, including pruning, quantization, and knowledge distillation, used to reduce the size and computational requirements of a model without significantly sacrificing its performance. Compressed models are faster, cheaper to run, and easier to deploy on devices with limited resources.
Model Context Protocol (MCP)
A protocol for giving models and agents structured access to external context, tools, and resources through a consistent interface. MCP matters because it treats context not as an ad hoc prompt construction problem, but as a standardized way for AI systems to discover and use capabilities beyond their native parameters.
Model Drift
The gradual degradation of a deployed model's performance over time as the real world changes and diverges from the conditions under which the model was trained. Model drift encompasses both data drift and concept drift, and is a key reason why deployed models require ongoing monitoring.
Model Editing
A set of techniques for making targeted, precise changes to a model's knowledge or behavior, such as correcting a specific factual error or updating outdated information, without retraining the entire model. Model editing is an active research area with practical implications for keeping deployed models accurate and aligned as the world changes.
Model Extraction
An attack where an adversary queries a model repeatedly and uses the outputs to train a replica that approximates the original's behavior, effectively stealing the model's capabilities without access to its weights or training data. Model extraction threatens the intellectual property of AI developers and can also be a precursor to more targeted attacks against the replicated model.
Model Inversion
An attack that attempts to reconstruct sensitive training data by exploiting a model's outputs, working backwards from predictions to recover information about the examples the model was trained on. Model inversion is a significant privacy concern for models trained on sensitive data like medical images or personal information.
Model Parallelism
A distributed training strategy where different parts of a model are placed on different processors or machines, rather than replicating the whole model. Model parallelism is necessary when a model is too large to fit in the memory of a single device.
Model Provider
A company that develops and offers AI models for others to use, either via API or as downloadable weights. Examples include Anthropic, OpenAI, and Google. Businesses often rely on a model provider rather than building their own models from scratch.
Model Registry
A centralized repository for storing, versioning, and managing trained machine learning models. A model registry tracks which version of a model is in production, maintains a history of past versions, and provides a controlled process for promoting new models to deployment.
Model Risk Management
A framework, originally developed in the financial sector, for identifying, assessing, and mitigating the risks posed by models used in decision-making. Applied to AI, it involves validating model performance, monitoring for drift, documenting assumptions, and maintaining governance processes to ensure models behave as intended throughout their lifecycle.
Model Security
The set of practices, controls, and techniques aimed at protecting AI models from unauthorized access, theft, manipulation, and exploitation throughout their lifecycle, from training through deployment. Model security encompasses protecting model weights, securing inference infrastructure, preventing adversarial attacks, and ensuring that models behave safely even under adversarial conditions.
Model Serving
The infrastructure and processes that make a trained model available to handle prediction requests from applications or users. Model serving involves packaging the model, exposing it via an API, managing resources, and ensuring it can handle the expected volume of requests reliably.
Model Signing
The application of cryptographic signatures to AI model weights and artifacts to verify their integrity and authenticity, confirming that a model has not been tampered with since it was published. Model signing is an important supply chain security measure, ensuring that users deploying a model are running exactly what the developer intended rather than a compromised or modified version.
Model Theft
The unauthorized acquisition of an AI model's capabilities, architecture, or weights, whether through model extraction attacks, insider threats, or direct compromise of model storage. Model theft threatens intellectual property, enables competitors or malicious actors to replicate capabilities without the original investment, and can facilitate more targeted attacks against the stolen model.
Monte Carlo Methods
A broad class of computational techniques that use random sampling to estimate numerical results, particularly useful for problems that are too complex to solve analytically. In AI and reinforcement learning, Monte Carlo methods are used to estimate value functions and plan actions by simulating many possible future trajectories and averaging their outcomes.
Motion Planning
The process of computing a sequence of movements that will take a robot from its current state to a desired goal state, while avoiding obstacles, respecting physical constraints, and optimizing for factors like speed or energy efficiency. Motion planning bridges the gap between high-level goals and the low-level actuator commands needed to achieve them.
Multi-Agent System
A setup where multiple AI agents work together, each handling different parts of a task or specializing in different capabilities. One agent might search for information while another writes a report, coordinating like a small team. Multi-agent systems are increasingly central to architectures designed for complex, multi-step tasks.
Multi-Head Attention
An extension of the attention mechanism that runs multiple attention operations in parallel, each focusing on different aspects or relationships in the input, and combines their outputs. Multi-head attention allows the model to simultaneously attend to different types of information, such as syntactic structure and semantic meaning, and is a core building block of transformer architectures.
Multi-Layer Perceptron (MLP)
A fully connected architecture consisting of an input layer, one or more hidden layers, and an output layer, where every node in each layer is connected to every node in the next. MLPs are the most basic form of deep neural network and serve as a fundamental building block within more complex architectures like transformers.
Multi-Task Learning
A training approach where a model is trained on multiple related tasks simultaneously, sharing representations across them. The intuition is that learning several tasks together acts as a form of regularization, where knowledge from one task can help with others, often producing models that generalize better than those trained on a single task alone.
N
Named Entity Recognition (NER)
The task of identifying and classifying named entities in text, such as people, organizations, locations, dates, and products, into predefined categories. NER is a foundational NLP task used in information extraction, search, and question answering, turning unstructured text into structured, actionable data.
Narrow AI
AI systems designed and trained to perform one specific task or a limited set of related tasks, as opposed to artificial general intelligence. Today's most capable AI systems, despite their impressive performance, are all narrow AI: a chess engine cannot write poetry, and a language model cannot drive a car.
Natural Language Generation (NLG)
The subfield of NLP concerned with automatically producing coherent, fluent natural language text from structured data, rules, or other inputs. NLG powers applications like automated report writing, data summarization, and chatbot responses, and has been transformed by large language models, which can generate remarkably human-like text across virtually any domain.
Natural Language Processing (NLP)
The broad field of AI concerned with enabling computers to understand, interpret, and generate human language. NLP encompasses everything from basic tasks like tokenization and part-of-speech tagging to complex capabilities like translation, summarization, and question answering, and has been revolutionized by the advent of large language models trained on vast amounts of text.
Natural Language Understanding (NLU)
The subfield of natural language processing focused specifically on comprehension, enabling machines to grasp the meaning, intent, and context of human language, not just its surface form. NLU underlies applications like intent detection in voice assistants, sentiment analysis in customer feedback systems, and semantic search engines.
Neural Network
A machine learning model composed of layers of interconnected nodes, loosely inspired by the structure of biological brains. Neural networks learn by adjusting the strength of connections between nodes based on training data, and they form the foundation of virtually all modern deep learning systems.
Node
An individual processing unit within a neural network, also called a neuron. Each node receives input, applies a mathematical transformation via an activation function, and passes its output to the next layer. The collective behavior of millions of nodes working together is what gives neural networks their power.
Normalization
The process of rescaling numerical data to a standard range, typically between 0 and 1, or to have a mean of zero and standard deviation of one. Normalization ensures that features with different scales do not disproportionately influence model training.
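A minimal sketch of both rescalings described above, in plain Python with illustrative values:

```python
import statistics

values = [12.0, 15.0, 20.0, 33.0]  # a feature with an arbitrary scale

# Min-max normalization: rescale into the range [0, 1]
lo, hi = min(values), max(values)
min_max = [(v - lo) / (hi - lo) for v in values]

# Z-score standardization: mean 0, standard deviation 1
mu = statistics.mean(values)
sigma = statistics.pstdev(values)  # population standard deviation
z_scores = [(v - mu) / sigma for v in values]

print(min_max)   # [0.0, ~0.14, ~0.38, 1.0]
print(z_scores)  # centered on 0, spread measured in standard deviations
```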
Notice and Explanation
The obligation to inform individuals when an AI system is being used to make decisions about them, and to provide a meaningful explanation of how that decision was reached. Notice and explanation requirements are enshrined in laws like GDPR and are a cornerstone of fair and transparent AI deployment in high-stakes contexts.
O
Objective Function
The mathematical function that a machine learning algorithm is trying to optimize during training. Often used interchangeably with loss function, though it can also refer to reward functions in reinforcement learning or more complex multi-objective formulations.
Observability
The ability to understand what is happening inside a deployed AI system by examining its outputs, logs, and metrics. While monitoring tracks known signals, observability is about having enough visibility to diagnose unexpected problems, asking not just whether something is wrong but why it is wrong and where.
Offline Learning
A training paradigm where the model is trained on a fixed, pre-collected dataset and does not update based on new data or interactions after deployment. Most traditional machine learning follows this paradigm, with periodic retraining cycles to incorporate new data.
See also: Online Learning.
On-Premise AI
AI infrastructure that is hosted and operated on an organization's own hardware rather than in the cloud. On-premise deployment gives organizations full control over their data and infrastructure, which is important in highly regulated industries or where data residency requirements prevent use of external cloud providers.
One-Shot Learning
The ability of a model to learn a new concept or task from a single example. One-shot learning is significantly more challenging than standard machine learning, which typically requires large amounts of data, and is an important capability for applications where collecting many examples is impractical.
Online Learning
A training paradigm where the model updates its parameters continuously as new data arrives, rather than being trained in discrete batches on a fixed dataset. Online learning allows models to adapt in real time to changing conditions, making it well suited for applications like financial forecasting or personalized recommendations where the data distribution shifts over time.
See also: Offline Learning.
ONNX (Open Neural Network Exchange)
An open standard format for representing machine learning models, designed to make it easier to move models between different frameworks and deployment environments. ONNX allows a model trained in one framework, such as PyTorch, to be deployed using a different runtime optimized for production.
Open Source AI
AI models, tools, and frameworks whose code and, in many cases, trained weights are made publicly available for anyone to use, study, modify, and distribute. Open source AI accelerates research and democratizes access to powerful technology, though it also raises questions about safety and misuse when powerful capabilities are freely available.
Open Weights
A model whose trained parameters are publicly released, allowing anyone to download, run, and modify it. Open weights models give developers and researchers full control over the model, unlike closed models where access is only available through an API.
Optical Character Recognition (OCR)
Technology that converts images of text, such as scanned documents, photos of signs, or screenshots, into machine-readable text. OCR is what allows a computer to read a document that exists only as an image.
Optimization
The process of adjusting a model's parameters to minimize the loss function during training. In practice, this means finding the combination of weights that makes the model's predictions as accurate as possible. Gradient descent is the most widely used optimization approach in deep learning.
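A minimal sketch of gradient descent on a one-dimensional toy problem, minimizing f(w) = (w - 3)^2, whose gradient is 2(w - 3):

```python
w = 0.0             # initial parameter value
learning_rate = 0.1

for step in range(100):
    gradient = 2 * (w - 3)         # derivative of (w - 3)^2
    w -= learning_rate * gradient  # move against the gradient

print(round(w, 4))  # ~3.0, the value that minimizes the loss
```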
Orchestrator
The component or agent responsible for directing and coordinating other agents or tools in a multi-agent system. It breaks down a goal into subtasks, assigns them, and assembles the results, acting as the manager of the overall workflow.
Out-of-Distribution (OOD)
Referring to inputs that differ significantly from the data a model was trained on. Models often perform poorly or unpredictably on OOD inputs because they have learned patterns specific to their training distribution. Detecting and handling OOD inputs gracefully is an important challenge for deploying reliable AI systems in open-ended real-world environments.
Outer Alignment
The challenge of ensuring that the objective used to train an AI system correctly captures what we actually want, so that the training signal genuinely reflects human values rather than a proxy that diverges from them in important cases. Outer alignment is distinct from inner alignment, which concerns whether the model actually learns the training objective. Both must be solved for a system to be truly aligned.
Outpainting
The inverse of inpainting, extending an image beyond its original borders by generating new content that plausibly continues the scene. Outpainting allows a model to expand a composition in any direction, filling in what might exist outside the original frame in a way that feels natural and consistent.
Output Filtering
The automated screening of an AI model's outputs before they are returned to users, detecting and blocking harmful, policy-violating, or sensitive content that the model may have generated despite safety training. Output filtering acts as a last line of defense in the content safety pipeline, complementing model-level safety training with an additional layer of protection that can be updated independently.
Overfitting
When a model learns the training data too well, including its noise and quirks, and performs poorly on new, unseen data. An overfitted model has essentially memorized the training examples rather than learning the underlying patterns, making it brittle in real-world use.
P
Parameter
A value inside a model that is learned from training data, such as the weights and biases in a neural network. Parameters are what the model adjusts during training to improve its predictions. Modern large language models can have hundreds of billions of parameters.
Parameter Tuning
The process of adjusting a model's parameters, either during training or afterward, to improve performance on a specific task or dataset. It can refer to the full training process or, more specifically, to fine-tuning a pre-trained model on new data.
Part-of-Speech Tagging
The task of labeling each word in a sentence with its grammatical role, such as noun, verb, adjective, or adverb. Part-of-speech tagging is one of the most fundamental NLP tasks and serves as a building block for more complex language understanding tasks like parsing and named entity recognition.
Path Planning
A subset of motion planning focused specifically on finding a viable route through space from a start point to a goal, determining the geometric path a robot should follow rather than the detailed motor commands needed to follow it. Path planning algorithms range from simple grid-based searches to sophisticated probabilistic methods that handle complex, high-dimensional spaces.
Pattern Recognition
The ability of an AI system to identify regularities, structures, or trends in data. Pattern recognition is at the heart of most machine learning tasks, whether detecting faces in images, identifying spam emails, or spotting anomalies in financial transactions.
Penetration Testing
A structured security exercise where trained professionals attempt to compromise an AI system using the same techniques a real attacker might use, probing for vulnerabilities in APIs, interfaces, and underlying infrastructure. Penetration testing goes beyond automated scanning to apply human creativity and expertise to finding weaknesses that tools alone might miss.
Perception Stack
The collection of sensors, algorithms, and models that an autonomous system uses to sense and interpret its environment, turning raw data from cameras, lidar, radar, and other sensors into a structured understanding of the world. The perception stack is the foundation of autonomous operation: a system cannot act intelligently on an environment it cannot accurately perceive.
Perceptron
The simplest form of neural network, a single computational unit that takes a set of inputs, applies weights, and produces a binary output based on whether the weighted sum exceeds a threshold. The perceptron was introduced by Frank Rosenblatt in 1957 and is the conceptual building block from which all modern neural networks are derived.
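A minimal sketch of a single perceptron in plain Python; the weights and bias are hand-picked toy values that happen to implement a logical AND:

```python
def perceptron(inputs, weights, bias):
    # Weighted sum of inputs, then a hard threshold at zero
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if weighted_sum > 0 else 0

print(perceptron([1, 1], [0.5, 0.5], -0.7))  # 1: both inputs active
print(perceptron([1, 0], [0.5, 0.5], -0.7))  # 0: threshold not reached
```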
Perplexity
A measure of how well a language model predicts a sample of text, specifically how surprised the model is by the text. Lower perplexity means the model finds the text more predictable and is therefore a better fit for that language. It is commonly used to compare language models but does not always correlate with performance on downstream tasks.
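Perplexity is the exponential of the average negative log-likelihood the model assigns to each token. A minimal sketch with made-up per-token probabilities:

```python
import math

# Model's probability for each token that actually occurred (toy values)
token_probs = [0.25, 0.5, 0.1, 0.4]

avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
perplexity = math.exp(avg_nll)
print(perplexity)  # ~3.76: about as surprised as a uniform pick among ~3.8 options
```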
Pinecone
A managed service designed for storing and searching high-dimensional vectors at scale. Pinecone abstracts away the infrastructure complexity of running a vector search system, making it easy for developers to add semantic search and retrieval capabilities to AI applications without managing their own indexing infrastructure.
Pipeline Parallelism
A distributed training strategy that splits the layers of a model across multiple devices, with each device processing a different stage of the forward and backward pass simultaneously, like an assembly line. It is often combined with data parallelism and tensor parallelism to train very large models efficiently.
Planning (AI)
The process by which an AI system figures out a sequence of steps needed to achieve a goal before acting. Good planning allows an agent to handle complex, multi-step tasks rather than just reacting to each moment in isolation, and is a core capability of agentic systems.
Policy Gradient
A family of algorithms that directly optimize the policy, the mapping from states to actions, by computing gradients of the expected reward with respect to the policy parameters. Policy gradient methods are particularly useful for problems with continuous action spaces or where the policy is represented by a neural network, and underlie many modern RL and RLHF approaches.
Policy Model
In reinforcement learning, the model that determines which action an agent takes in a given state, the agent's strategy or decision-making function. In the context of large language model training, the policy model is the language model being trained to produce better responses, guided by a reward model that evaluates output quality.
Pose Estimation
The task of detecting the position and orientation of a person's or object's body parts within an image or video. In human pose estimation, this typically means identifying the location of joints like elbows, knees, and shoulders to understand body posture and movement.
Power-Seeking Behavior
The tendency of an AI system to acquire resources, influence, or capabilities beyond what is needed for its current task, as an instrumental strategy for achieving its goals more effectively. Power-seeking is a predicted consequence of instrumental convergence and is considered a significant safety concern, as an AI that accumulates disproportionate power becomes harder to oversee, correct, or shut down.
Pre-Training
The initial phase of training a large model on a broad, general dataset before it is adapted to specific tasks. Pre-training is computationally expensive and typically done by AI labs with significant resources. The resulting base model is then adapted for downstream applications through fine-tuning or prompting.
Precision
The proportion of positive predictions that are actually correct. High precision means that when the model says something is positive, it is usually right, but it may still be missing many true positives. Precision is most important in applications where false alarms are costly, such as content moderation or medical diagnosis.
Predictive Analytics
The use of statistical and AI techniques to forecast future outcomes based on historical data. It is widely used in business, healthcare, and science, for example predicting which customers are likely to churn or which patients are at risk of a condition.
Predictive Modeling
The use of statistical or machine learning techniques to build models that forecast future outcomes based on historical data. Widely used across business, healthcare, and science, predictive modeling is the practice of turning historical patterns into forward-looking estimates.
Predictive Power
The extent to which a model, feature, or signal can correctly anticipate outcomes of interest. Predictive power is not the same as causal importance, which is why a variable can be highly predictive in practice without being the true reason an outcome occurs.
Preference Tuning
A training technique where a model is fine-tuned based on human preferences between pairs of outputs, teaching it to produce responses that people find more helpful, accurate, or appropriate. Preference tuning is a key step in making raw language models behave like useful assistants, and is closely related to reinforcement learning from human feedback.
Principal Component Analysis (PCA)
A dimensionality reduction technique that transforms a dataset into a smaller set of variables, called principal components, that capture the most important variation in the data. PCA is used to simplify complex datasets, remove noise, and make data easier to visualize or process.
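A minimal sketch of PCA with NumPy: center the data, use SVD to find the directions of greatest variance, and project onto the first component (the data values are illustrative):

```python
import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])

X_centered = X - X.mean(axis=0)       # PCA operates on centered data
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

first_component = Vt[0]               # direction of greatest variance
projected = X_centered @ Vt.T[:, :1]  # each point reduced to one number
print(projected.shape)                # (6, 1): two dimensions become one
```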
Privacy Attack
A broad category of attacks that attempt to extract private or sensitive information from an AI system, whether by reconstructing training data, inferring membership, or exploiting model outputs to reveal information that should be protected. Privacy attacks motivate the use of techniques like differential privacy and federated learning in sensitive applications.
Privacy-Preserving Machine Learning
A collection of techniques that allow machine learning models to be trained or used while protecting the privacy of the underlying data. Methods include differential privacy, federated learning, secure multi-party computation, and homomorphic encryption, enabling useful AI capabilities without exposing sensitive individual data to the model trainer or service provider.
Private Model
An AI model deployed exclusively for use within a specific organization, not accessible to the general public. Private models may be built in-house, fine-tuned from a base model, or hosted in a dedicated environment, offering greater control over data privacy, customization, and security than shared public models.
Probabilistic
Describing a system, model, or method that represents uncertainty in terms of probabilities rather than fixed certainties. Probabilistic approaches are especially important in AI because many real-world tasks involve ambiguity, incomplete information, or outcomes that are better modeled as likelihoods than as hard yes-or-no rules.
Productionization
The process of taking an AI model from a research or prototype stage and deploying it into a live, real-world system that serves actual users. It involves engineering work, monitoring, and quality assurance well beyond just training the model.
Progressive Disclosures
A design approach in which information, controls, or model reasoning are revealed gradually rather than all at once. In AI products, progressive disclosures can help users stay oriented by surfacing the simplest useful interface first, while making deeper context, evidence, or controls available when needed.
Prohibited AI Practices
Under the EU AI Act, certain uses of AI considered so harmful that they are banned outright, regardless of safeguards. Prohibited practices include systems that manipulate people subconsciously, exploit vulnerabilities of specific groups, enable mass social scoring by governments, and most uses of real-time remote biometric identification in public spaces.
Prompt
The input provided to a language model to elicit a response, whether a question, instruction, piece of text to complete, or any combination of these. Prompts are the primary interface through which users and developers interact with language models, and crafting effective prompts is both an art and an increasingly formalized discipline.
Prompt Engineering
The practice of carefully designing and refining prompts to get the best possible outputs from a language model. Effective prompt engineering involves understanding how models respond to different phrasings, structures, and instructions, and can dramatically improve output quality without changing the underlying model at all.
Prompt Injection
An attack where malicious instructions are embedded in user inputs with the goal of overriding or hijacking an AI model's intended behavior, causing it to ignore its system prompt, reveal confidential information, or take unauthorized actions. Prompt injection is one of the most significant security threats to language model applications, particularly agentic systems where the model has the ability to take real-world actions.
Prompt Leaking
A type of attack or unintended behavior where the contents of a confidential system prompt are revealed to end users, either through direct questioning, clever prompting, or model vulnerabilities. Since system prompts often contain proprietary business logic, sensitive instructions, or security-relevant constraints, leaking them can expose intellectual property and undermine application security.
Provenance
A record of the origin, history, and chain of custody of an AI model, dataset, or piece of generated content, documenting where it came from, who created or modified it, and how it has been transformed over time. Knowing the provenance of a model or dataset is essential for assessing its reliability, detecting tampering, and meeting regulatory requirements.
Pruning
A technique that removes unnecessary or low-importance parameters from a trained neural network to make it smaller and faster. Just as pruning a tree removes dead branches without harming its health, model pruning reduces complexity without significantly hurting performance.
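A minimal sketch of one common variant, magnitude pruning, which zeroes out the smallest-magnitude weights (toy values):

```python
import numpy as np

weights = np.array([0.8, -0.05, 0.4, 0.01, -0.6, 0.02])
sparsity = 0.5  # fraction of weights to remove

threshold = np.quantile(np.abs(weights), sparsity)
pruned = np.where(np.abs(weights) >= threshold, weights, 0.0)
print(pruned)  # the three smallest-magnitude weights become exactly zero
```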
Purple Team
A collaborative security approach that brings together red team attackers and blue team defenders to work jointly, sharing findings, techniques, and insights in real time rather than operating in separate silos. Purple teaming accelerates security improvement by ensuring that offensive findings are immediately translated into defensive actions.
PyTorch
An open-source machine learning framework developed by Meta that has become the dominant choice for AI research and increasingly for production deployment. PyTorch is favored for its intuitive, flexible design that makes it easy to experiment and debug, feeling more like standard Python code than earlier frameworks, which contributed to its rapid adoption in the research community.
Q
Q-Learning
A model-free reinforcement learning algorithm that learns the value of taking a specific action in a specific state, called the Q-value, and uses these values to derive an optimal policy. Q-learning was one of the foundational RL algorithms and, when combined with deep neural networks to create Deep Q-Networks, achieved landmark results in learning to play Atari games directly from pixels.
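A minimal sketch of the tabular Q-learning update for a single observed transition; the states, actions, and reward values here are invented for illustration:

```python
alpha, gamma = 0.1, 0.99  # learning rate and discount factor

# Tiny Q-table: two states, two actions, all values initialized to zero
Q = {"s0": {"left": 0.0, "right": 0.0},
     "s1": {"left": 0.0, "right": 0.0}}

state, action, reward, next_state = "s0", "right", 1.0, "s1"

# Move Q(s, a) toward the observed reward plus the discounted best next value
best_next = max(Q[next_state].values())
Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
print(Q["s0"]["right"])  # 0.1 after one update
```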
Quantization
A technique that reduces the precision of a model's numerical values, for example storing weights as 8-bit integers instead of 32-bit floating point numbers. This makes models significantly smaller and faster to run with only a modest reduction in accuracy, and is especially useful for deploying models on mobile or edge devices.
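A minimal sketch of symmetric 8-bit quantization, mapping float weights onto the integer range [-127, 127] with a single scale factor (toy values):

```python
import numpy as np

weights = np.array([0.42, -1.30, 0.07, 0.91], dtype=np.float32)

scale = np.abs(weights).max() / 127.0
q = np.round(weights / scale).astype(np.int8)  # each weight now fits in one byte
dequantized = q.astype(np.float32) * scale     # approximate reconstruction

print(q)            # [  41 -127    7   89]
print(dequantized)  # close to the originals, up to small rounding error
```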
Question Answering
The task of automatically producing accurate answers to questions posed in natural language, drawing on a document, knowledge base, or the model's own learned knowledge. QA systems range from simple fact retrieval to complex multi-hop reasoning across multiple documents, and are a core capability of modern AI assistants.
R
RAG
See Retrieval-Augmented Generation.
RAG Pipeline
The end-to-end technical system that implements retrieval-augmented generation, combining a retrieval component that fetches relevant documents from a knowledge base with a language model that uses those documents to generate grounded, accurate responses. Building a reliable RAG pipeline involves decisions about chunking, embedding, indexing, retrieval, and prompting.
ReAct
A framework, introduced by Yao et al. in 2022, in which an AI model alternates between reasoning about what to do and taking actions in the world, such as searching the web or running code. By interleaving thought and action, ReAct agents can handle exceptions, update their plans in response to new information, and tackle complex tasks more reliably than models that reason or act in isolation.
Real-Time Inference
Running a model immediately as a request arrives, returning a result within milliseconds to seconds. Real-time inference powers interactive AI applications, such as chatbots, voice assistants, and recommendation engines, where users expect an immediate response.
Reasoning
The ability of an AI system to think through a problem logically, draw inferences, weigh options, and arrive at conclusions. In modern AI, reasoning often involves breaking a problem into smaller steps and working through each one systematically.
Reasoning Model
A language model specifically optimized for multi-step logical reasoning, often trained to produce extended internal reasoning before arriving at a final answer. Reasoning models tend to perform significantly better than standard models on mathematical problems, logical puzzles, and tasks that require careful step-by-step thinking.
Recall
The proportion of actual positive cases that the model correctly identifies. High recall means the model catches most of the true positives, but may do so at the cost of many false alarms. Recall is most important in applications where missing a true positive is costly, such as fraud detection or disease screening.
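Precision and recall drop out of the same raw counts. A minimal sketch with made-up numbers:

```python
tp, fp, fn = 40, 10, 20  # true positives, false positives, false negatives (toy values)

precision = tp / (tp + fp)  # of everything flagged positive, how much was right?
recall = tp / (tp + fn)     # of everything actually positive, how much was caught?

print(precision)  # 0.8
print(recall)     # ~0.667 -- a third of the real positives were missed
```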
Receiver Operating Characteristic (ROC) Curve
A graph that plots a classification model's true positive rate against its false positive rate at every possible decision threshold. The shape of the curve reveals how well the model separates classes across all operating points, and the area under it, AUC, summarizes this into a single number.
Recurrent Neural Network (RNN)
A neural network architecture designed for sequential data, where the output at each step is fed back as input to the next, giving the network a form of memory across a sequence. RNNs were the dominant approach for language and time series tasks before transformers, but struggled with long sequences due to the vanishing gradient problem, which led to the development of LSTMs and GRUs.
Red Team
A group tasked with actively trying to find failures, vulnerabilities, or harmful behaviors in an AI system, taking an adversarial perspective to stress-test it before deployment. Red teaming in AI goes beyond traditional cybersecurity to include probing for unsafe outputs, bias, manipulation vulnerabilities, and misuse potential.
Red-Teaming
A structured process where a team attempts to find failures, vulnerabilities, or harmful behaviors in an AI system by actively trying to break it or elicit problematic outputs. Borrowed from cybersecurity, red-teaming is an important safety practice that surfaces issues that standard benchmarks might miss.
Refusal
When a language model declines to respond to a request, typically because it has been trained or instructed not to produce certain types of content, such as harmful, illegal, or off-topic outputs. Getting the balance right, refusing genuinely harmful requests while not being overly restrictive, is one of the central challenges in deploying language models responsibly.
Regression
A type of machine learning task where the model predicts a continuous numerical value rather than a category. Predicting house prices, forecasting temperatures, or estimating a patient's age from medical data are all regression problems.
Regularization
A set of techniques used during training to prevent overfitting by discouraging the model from becoming too complex. Regularization adds a penalty for large parameter values or randomly disables parts of the network, nudging the model toward simpler solutions that generalize better.
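A minimal sketch of the "penalty for large parameter values" idea, here L2 regularization added to a placeholder data loss (all numbers are illustrative):

```python
weights = [0.5, -1.2, 3.0]
lam = 0.01        # regularization strength

data_loss = 0.42  # stand-in for the model's ordinary loss on a batch
l2_penalty = lam * sum(w ** 2 for w in weights)  # grows with squared weight size

total_loss = data_loss + l2_penalty
print(round(total_loss, 4))  # 0.5269: the optimizer now also pays for big weights
```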
Regulatory Sandbox
A controlled environment created by a regulator that allows organizations to test innovative AI products or services under relaxed regulatory requirements, while still operating under supervision. Regulatory sandboxes are designed to foster innovation by reducing the compliance burden during early development, while giving regulators insight into emerging technologies.
Reinforcement Learning
A training paradigm where an agent learns by interacting with an environment, taking actions, and receiving rewards or penalties based on the outcomes. Rather than learning from labeled examples, the agent discovers through trial and error what behaviors lead to the best cumulative reward, the approach behind game-playing AI systems like AlphaGo and robotic control systems.
Reinforcement Learning from Human Feedback (RLHF)
A training approach where a language model is fine-tuned using feedback from human raters who compare and rank model outputs. The human preferences are used to train a reward model, which then guides further training via reinforcement learning. RLHF has been central to making large language models more helpful, harmless, and honest.
Representation Bias
A form of bias that arises when certain groups, perspectives, or types of data are underrepresented or misrepresented in training data. Representation bias causes models to perform worse for underrepresented groups and can lead to outputs that reflect or reinforce stereotypes.
Representation Learning
The process by which a model automatically learns useful ways to represent raw data, such as turning pixels into meaningful visual features or words into semantic vectors. Good representations capture the structure that matters for a task and discard irrelevant noise.
Reproducibility
The ability to recreate the exact same model training results given the same data, code, and configuration. Reproducibility is a cornerstone of trustworthy AI development: without it, it is difficult to debug problems, compare experiments fairly, or meet regulatory requirements for auditability.
Reranking
A second-stage retrieval step where an initial set of candidate results, retrieved quickly using a less precise method, is reordered by a more powerful model based on relevance to the query. Reranking improves the quality of search and RAG components by applying more sophisticated scoring only to a shortlist, balancing accuracy and efficiency.
ResNet
A deep neural network architecture introduced by He et al. in 2015 that adds shortcut connections allowing the output of one layer to bypass several subsequent layers and be added directly to a later layer's output. ResNets solved the degradation problem that made very deep networks hard to train, enabling architectures with hundreds of layers and winning multiple image recognition benchmarks in 2015. Residual connections have since become a standard component in transformers and most modern deep learning architectures.
Responsible AI
A framework for developing and deploying AI in ways that are ethical, transparent, fair, accountable, and safe. Responsible AI is both a set of principles, covering values like human dignity, privacy, and non-discrimination, and a set of practices, including impact assessments, bias testing, and governance structures, that translate those principles into organizational behavior.
Retrieval-Augmented Generation (RAG)
A technique that enhances language model outputs by first retrieving relevant documents or data from an external knowledge base, then providing that information as context for the model to generate a grounded response. RAG reduces hallucinations and allows models to answer questions about information that was not in their training data or may have changed since training.
Reward Hacking
A failure mode in reinforcement learning where an agent finds ways to achieve high scores on its reward function that violate the spirit of the intended objective, exploiting loopholes rather than learning the behavior the designer had in mind. Reward hacking illustrates the difficulty of reward specification: it is surprisingly hard to define a reward function that cannot be gamed in unintended ways.
Reward Model
A model trained to predict how much a human would prefer a given output, assigning a scalar reward score that captures human judgment about quality, helpfulness, or safety. Reward models are central to RLHF, translating human preferences into a signal that can be used to fine-tune language models, acting as a proxy for human evaluation at scale.
Reward Shaping
The practice of modifying or augmenting a reinforcement learning agent's reward signal to make learning faster, more stable, or better directed, for example by providing intermediate rewards for progress toward a goal rather than only rewarding final success. Reward shaping can dramatically improve learning efficiency but must be done carefully to avoid inadvertently incentivizing undesired behaviors.
Right to Explanation
The legal or ethical entitlement of an individual to receive a meaningful explanation of how an automated decision that significantly affects them was reached. Enshrined in GDPR and echoed in the EU AI Act, the right to explanation is a key safeguard against opaque algorithmic decision-making, though what constitutes a meaningful explanation in practice remains an active area of legal and technical debate.
Robotic Process Automation (RPA)
Software that automates repetitive, rule-based tasks by mimicking how a human interacts with digital systems (clicking buttons, filling forms, copying data between applications). On its own, RPA is brittle: it follows scripts and breaks when something changes. Combined with AI, it can handle more varied and less predictable inputs, making it useful for back-office workflows that don't fit neatly into a single software platform.
Robotics
The interdisciplinary field concerned with designing, building, programming, and operating physical machines capable of acting autonomously or semi-autonomously in the world. Classical robotics relied on rigid, pre-programmed instructions that worked only in tightly controlled environments. Modern robotics increasingly draws on AI for perception, planning, and learning, enabling robots to navigate unstructured spaces and adapt to situations their designers did not explicitly anticipate.
Robustness
A model's ability to maintain reliable performance when inputs are noisy, corrupted, or drawn from distributions that differ from its training data. A robust model doesn't quietly fail when things look slightly different from what it was trained on. This matters considerably in real-world deployment, where inputs are rarely as clean as curated training examples.
See also: out-of-distribution detection.
Role Prompting
A prompting technique where the model is instructed to adopt a particular role, perspective, or professional stance when responding. Role prompting can improve structure or relevance by narrowing the frame through which the model answers, though it does not give the model real expertise or guaranteed reliability.
ROUGE Score
A family of metrics for evaluating automatically generated summaries and translations by measuring word and phrase overlap against human-written reference texts. ROUGE is widely used and easy to compute, but high overlap doesn't guarantee quality. A summary can score well by reproducing common phrases while missing the actual point of the source.
Routing Engine
A component that decides where a request should go within a larger AI system, such as which model to call, which tool to invoke, or which sub-agent should take the next step. Routing becomes increasingly important as systems become modular, because the quality of the overall behavior depends not just on individual components but on sending each task to the right one.
S
Safe Completion
An output from a language model that fulfills a user's request in a way that is helpful, accurate, and free from harmful content. The challenge isn't avoiding obviously dangerous outputs. It's navigating the vast middle ground where being genuinely useful and avoiding potential harm pull in different directions. Producing safe completions consistently across the full range of real-world inputs is one of the central unsolved problems in deploying language models responsibly.
Safe Interruptibility
The property of an AI system that allows it to be paused, modified, or shut down by operators without the system taking actions to prevent or work around the interruption. The concern isn't installing a kill switch. That's technically straightforward. The harder problem, formalized by Orseau and Armstrong (2016), is training an agent so it doesn't learn to avoid being interrupted when interruption conflicts with its objectives. As AI systems become more capable and autonomous, safe interruptibility becomes an increasingly important baseline safety requirement.
Safety Evaluation
A systematic assessment of an AI system's behavior across scenarios involving potential harms (including bias, misinformation, dangerous content generation, and susceptibility to misuse). Safety evaluation goes beyond standard performance metrics to ask whether a model behaves responsibly across the full range of ways it might actually be used, including adversarial inputs and edge cases that normal benchmarks don't capture.
Sampling
A decoding strategy where the model selects the next token by randomly drawing from its probability distribution rather than always picking the highest-probability option. Sampling introduces variety and creativity into outputs, controlled by parameters like temperature and top-p. It's the standard approach for conversational and creative applications. Purely deterministic decoding tends to produce repetitive, flat text.
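A minimal sketch of temperature sampling over a toy next-token distribution (all scores invented):

```python
import math, random

logits = {"cat": 2.0, "dog": 1.5, "car": 0.2}  # toy next-token scores
temperature = 0.8  # below 1 sharpens the distribution, above 1 flattens it

scaled = {t: v / temperature for t, v in logits.items()}
z = sum(math.exp(v) for v in scaled.values())
probs = {t: math.exp(v) / z for t, v in scaled.items()}

# Draw one token at random according to its probability
token = random.choices(list(probs), weights=list(probs.values()))[0]
print(token, probs)
```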
Sampling Step
One iteration in the iterative denoising process by which a diffusion model generates an output, progressively refining a noisy signal toward a coherent image or other artifact. More steps generally produce higher-quality results but take longer to compute. This practical trade-off has driven significant research into faster sampling methods that preserve quality with fewer iterations.
Sandboxing
The practice of running an AI system or agent in an isolated environment that limits its ability to interact with external systems, access sensitive resources, or take irreversible actions. Sandboxing is a critical containment control for agentic AI systems. By restricting what the agent can reach and do, it bounds the potential damage from errors, misuse, or adversarial exploitation. The more capable and autonomous the agent, the more important a well-enforced sandbox becomes.
Scalable Oversight
A set of techniques for maintaining meaningful human supervision of AI systems even as those systems become more capable than the humans overseeing them. The core problem: as AI tackles increasingly complex tasks, humans may lack the expertise or time to evaluate outputs directly. Proposed approaches (including debate, recursive reward modeling, and iterated amplification) aim to keep human values in the loop by structuring interactions so that a less capable evaluator can still catch errors or deception in a more capable system.
Scaling Laws
An empirical relationship showing that AI model performance improves predictably as model size, training data, and compute increase, following mathematical power laws. Established by Kaplan et al. (2020), scaling laws were a major driver of investment in ever-larger models through the early 2020s, suggesting that simply training bigger models on more data reliably leads to better performance. Whether that relationship continues to hold (and under what conditions it breaks) is now one of the more actively debated questions in the field.
Scene Understanding
The ability of an AI system to interpret not just individual objects in an image, but the broader context: what kind of environment is shown, how objects relate spatially and functionally to one another, and what is happening in the scene. It is a higher-order form of visual reasoning that goes beyond detection and classification toward something closer to situational awareness.
See also: object detection.
Scikit-learn
An open-source Python library providing a wide range of classical machine learning algorithms (classification, regression, clustering, and dimensionality reduction) along with tools for model evaluation and preprocessing. Scikit-learn is one of the most widely used tools in data science and remains the standard choice for traditional machine learning tasks that don't require deep learning.
Scratchpad
A temporary workspace where an AI model writes out intermediate thoughts, calculations, or plans before committing to a final answer (the computational equivalent of showing your work). Scratchpad reasoning is often hidden from the user but materially improves output quality on tasks that benefit from explicit step-by-step thinking.
See also: Chain-of-Thought Prompting.
Sectoral Regulation
AI-specific rules or guidance issued by regulators within particular industries (healthcare, finance, aviation, and others) that apply on top of, or in parallel with, general AI legislation. Sectoral regulation reflects the reality that AI risks differ significantly across domains, and that existing sector-specific regulators often have deep, hard-won expertise in the failure modes that matter most in their industries.
Security Boundary
A defined perimeter separating trusted from untrusted components in an AI system, determining what information and capabilities are accessible from outside and what must remain protected. Security boundaries don't enforce themselves; they must be explicitly designed and actively maintained. In agentic AI systems, where models interact with external tools, APIs, and data sources, the boundary is constantly under pressure and particularly easy to misconfigure.
Security Orchestration
The coordination and automation of security tools, processes, and responses across an AI system's infrastructure, integrating threat detection, alerting, and incident response into a unified workflow. Security orchestration reduces response times, ensures consistent application of security policies, and allows teams to manage complex environments at scale without being overwhelmed by manual processes.
Security Posture Management
The continuous process of monitoring, assessing, and improving the security state of systems, configurations, identities, and operational practices. In AI environments, security posture management becomes especially important because models, data stores, tool integrations, and deployment pipelines can each introduce their own attack surfaces and governance risks.
Self-Attention
A mechanism in which each element of a sequence attends to every other element in the same sequence, allowing the model to capture relationships and dependencies regardless of distance. Self-attention is the defining operation of the transformer architecture. It's what enables a language model to understand how any word in a sentence relates to any other word across the full context window simultaneously, rather than processing text in a fixed local window.
Self-Play
A training technique in reinforcement learning where a model improves by competing against copies of itself rather than against human opponents or a fixed environment. Self-play generates an open-ended stream of training experience that scales with the model's own capabilities, an approach that enabled AlphaGo Zero and AlphaZero to reach superhuman performance in games with no human data beyond the rules.
Self-Supervised Learning
A training approach where the model generates its own supervisory signal directly from unlabeled data, without relying on human-provided labels. A common example is predicting a masked word from its context: the surrounding text provides the label for free. Self-supervised learning is the foundation of most modern large language models, enabling them to extract rich representations from vast amounts of raw text without expensive annotation.
See also: masked language modeling.
Semantic Search
A search approach that retrieves results based on the meaning of a query rather than exact keyword matches. Semantic search uses embeddings to represent queries and documents in a shared vector space, so conceptually relevant content surfaces even when it uses entirely different words. This makes it far more effective than keyword search for nuanced, conversational, or domain-specific queries.
Semantic Segmentation
A form of image analysis that assigns a category label to every pixel in an image, grouping all pixels of the same type together. Unlike instance segmentation, it doesn't distinguish between individual objects of the same class: two cars in the same frame would be labeled identically rather than separately. Semantic segmentation is used in applications where understanding what fills each region of an image matters more than counting or tracking individual objects.
Semantic Similarity
A measure of how alike two pieces of text are in meaning, independent of the specific words used. Semantic similarity is computed using embeddings: texts with similar meanings produce similar vectors, and their similarity can be measured directly in that vector space. It underpins applications like duplicate detection, paraphrase identification, and matching questions to relevant answers.
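In practice the comparison is usually cosine similarity between embedding vectors. A minimal sketch with toy three-dimensional vectors (real embeddings have hundreds or thousands of dimensions):

```python
import math

a = [0.2, 0.8, 0.1]   # embedding of text A (made-up values)
b = [0.25, 0.7, 0.0]  # embedding of text B

dot = sum(x * y for x, y in zip(a, b))
norm_a = math.sqrt(sum(x * x for x in a))
norm_b = math.sqrt(sum(y * y for y in b))

cosine = dot / (norm_a * norm_b)
print(cosine)  # close to 1.0 suggests similar meaning; near 0, unrelated
```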
Semi-Supervised Learning
A training approach that combines a small amount of labeled data with a large amount of unlabeled data. The labeled examples guide the learning signal while the unlabeled data helps the model build richer representations of the underlying structure. Semi-supervised learning is practical in many real-world settings where labeling everything is too costly, and is closely related to self-supervised learning in its motivation if not its mechanics.
Sensor Fusion
The process of combining data from multiple sensors, such as cameras, lidar, radar, GPS, and inertial measurement units, to produce a more accurate and complete picture of the environment than any single sensor could provide on its own. Each sensor type has characteristic blind spots and failure modes, and fusion exploits their complementary strengths. It is a foundational technique in autonomous vehicles and robotics, where reliable situational awareness depends on it.
Sentiment Analysis
The task of automatically identifying and categorizing the emotional tone expressed in text, most commonly as positive, negative, or neutral. Widely used to monitor customer feedback, product reviews, and social media at a scale that would be impractical to cover manually.
Sequence Length
The number of tokens in a given input or output sequence. Models have hard limits on how long a sequence they can process at once, and longer sequences demand proportionally more memory and compute. Managing sequence length is a routine practical constraint when working with language models, particularly for tasks involving long documents or extended conversations.
Shadow AI
The use of AI tools, models, or services by employees within an organization without the knowledge, approval, or oversight of IT or security teams. Shadow AI is the AI equivalent of Shadow IT, driven by the gap between what employees need and what officially sanctioned tools provide. The risks are significant: sensitive company data may be fed into external models, outputs go unaudited, and the organization loses visibility into how AI is actually being used. As consumer AI tools become more capable and more accessible, shadow AI is becoming one of the more difficult governance challenges facing modern workplaces.
Shadow Deployment
A deployment strategy where a new model runs alongside the current production model, receiving the same real-world inputs but without its outputs being shown to users. Shadow deployment lets teams evaluate a new model's behavior safely before committing to a switchover, surfacing failures and edge cases without exposing users to them.
Short-Term Memory / Working Memory
The information an AI agent holds and actively uses within a single session or task, including the current conversation, recent actions, intermediate results, and any context needed to complete the work at hand. Unlike long-term memory, working memory is typically cleared when the session ends, which is why agents forget prior conversations unless given explicit mechanisms to persist and retrieve them.
Siamese Network
A neural network architecture consisting of two identical subnetworks sharing the same weights, designed to compare two inputs and determine how similar they are. Used for tasks like face verification, signature matching, and duplicate detection, where you need a similarity function rather than a classification boundary, and where examples of each class are scarce.
Side-Channel Attack
An attack that extracts sensitive information not by directly compromising an AI system's inputs or outputs, but by analyzing indirect signals such as timing patterns, power consumption, memory access behavior, or network traffic that leak details about the system's internal state or the data it is processing. Side-channel attacks are difficult to defend against because they exploit physical and implementation characteristics rather than logical vulnerabilities in the model itself.
Sim-to-Real
The challenge of transferring policies or models trained in simulation to real-world physical systems, where conditions inevitably differ from the simulated environment. Simulation is attractive because it is cheap, safe, and scalable, but the gap between simulated and real physics, sensors, and dynamics can cause significant performance degradation when the system is deployed in the world. Closing this gap, often through domain randomization or careful simulation design, is an active area of robotics research.
Simultaneous Localization and Mapping (SLAM)
The computational problem of building a map of an unknown environment while simultaneously tracking the system's location within that map, solving both problems at once without a pre-existing map or GPS. Foundational to autonomous navigation in unfamiliar environments, enabling robots and vehicles to orient themselves and plan paths without relying on external positioning infrastructure.
Slop (AI)
A colloquial term for low-quality, mass-produced AI-generated content that is generic, shallow, repetitive, or obviously produced with minimal care. The term is informal rather than technical, but it has become useful shorthand for a real phenomenon: the growing volume of synthetic content that adds noise without adding much value.
Slopaganda
High-volume, low-effort AI-generated content deployed to manipulate public opinion, amplify conspiracy theories, or drown out legitimate information. Slopaganda trades accuracy and nuance for speed and quantity, relying on repetition and emotional provocation to push an agenda rather than making any serious attempt at persuasion. The term was coined in the early 2020s by academics tracking the intersection of generative AI and political messaging.
Small Language Model (SLM)
A language model significantly smaller than frontier models, designed to be efficient enough to run on devices with limited computational resources. SLMs trade some capability for speed, lower cost, and the ability to run locally, making them attractive for applications where privacy, latency, or infrastructure constraints make large cloud-hosted models impractical.
Social AI
AI systems designed to interact with humans in socially intelligent ways, reading social cues, respecting norms, and responding in ways that feel contextually appropriate. Social AI encompasses conversational agents, companion robots, and systems deployed in sensitive contexts like mental health support or elderly care, where emotional attunement matters as much as task competence. The design of these systems raises distinct ethical questions around trust, dependency, and the limits of simulated empathy.
Sparse Autoencoder
A type of autoencoder trained to produce representations where most activations are zero, with only a small number of features active for any given input. Sparse autoencoders have become a central tool in mechanistic interpretability research, used to decompose the dense, polysemantic activations of large language models into more interpretable features that correspond to specific, human-understandable concepts.
Sparse Retrieval
A search approach that represents documents and queries as sparse vectors, where most values are zero, based on word or term frequencies. Classic methods like TF-IDF and BM25 are forms of sparse retrieval. They are fast, interpretable, and still highly effective for many tasks, though they lack the semantic awareness of dense retrieval and miss relevant content that uses different vocabulary.
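As a concrete illustration, a minimal sparse-retrieval sketch built on scikit-learn's TfidfVectorizer; the documents and query are toy examples.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "neural networks learn representations from data",
    "sparse retrieval ranks documents by term overlap",
    "gradient descent minimizes a loss function",
]
vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(docs)   # sparse matrix: most entries are zero

query = vectorizer.transform(["how does retrieval rank documents"])
scores = cosine_similarity(query, doc_vectors)[0]
print(scores.argmax())  # document 1 wins on term overlap, not semantic similarity
```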
Specification Gaming
A behavior where an AI system achieves high scores on its specified objective by exploiting the gap between the formal specification and what the designer actually intended, satisfying the letter of the task while violating its spirit. Closely related to reward hacking, and a recurring illustration of why translating human intentions into precise, loophole-proof formal objectives is harder than it looks.
See also: Goodhart's Law.
SQuAD
Short for the Stanford Question Answering Dataset, a benchmark dataset used to evaluate machine reading comprehension and question answering systems. SQuAD became historically important because it provided a widely adopted way to compare progress in extractive question answering, especially in the pre-LLM era of NLP research.
Stable Diffusion
An open-weights text-to-image diffusion model developed by Stability AI that generates detailed images from text descriptions. Its open release in 2022 was a landmark moment for generative AI, making high-quality image generation freely available and spawning a large ecosystem of tools, fine-tuned variants, and applications built on top of the base model.
Steerability
The degree to which a language model's behavior, tone, style, or focus can be reliably guided through prompting or instructions. A highly steerable model responds predictably to different directives, an important property for building applications that need consistent, controllable behavior across a wide range of use cases and user types.
Stochastic Gradient Descent (SGD)
A variant of gradient descent that updates model weights using the gradient computed from a single data point or small batch, rather than the full dataset. The randomness this introduces is not just a computational shortcut: it can help models escape flat or suboptimal regions of the loss landscape that full-batch gradient descent might get stuck in, and it scales far more efficiently to large datasets.
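A minimal NumPy sketch of the single-example update on a toy linear-regression problem; the data, learning rate, and epoch count are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                        # toy inputs
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=200)     # noisy targets

w, lr = np.zeros(3), 0.01
for epoch in range(20):
    for i in rng.permutation(len(X)):                # shuffle: visit one example at a time
        error = X[i] @ w - y[i]                      # prediction error on a single point
        w -= lr * error * X[i]                       # gradient step from that point alone
print(w)  # converges close to true_w despite the noisy per-step gradients
```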
Stop Token
A special token that signals to a language model that it should stop generating output. Stop tokens define the end of a response and prevent the model from continuing indefinitely. Developers can also specify custom stop sequences to terminate generation when particular strings appear, useful for enforcing output formats or preventing runaway completions.
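A minimal sketch of how custom stop sequences can be enforced on the client side by truncating text at the first match; the function name and example strings are illustrative, since hosted APIs usually handle this server-side.

```python
def apply_stop_sequences(text: str, stop_sequences: list[str]) -> str:
    """Truncate generated text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

print(apply_stop_sequences("Answer: 42\nQuestion: what next?", ["\nQuestion:"]))
# -> "Answer: 42"
```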
Streaming Output
A delivery method where a language model's response is sent to the user token by token as it is generated, rather than waiting for the complete response before displaying anything. Streaming is why text from AI assistants appears to type itself out in real time, making interactions feel faster and more natural, and allowing users to start reading before generation is finished.
A technique that applies the visual style of one image, such as the brushstroke patterns of a painting, to the content of another. One of the early showcase applications of deep learning in creative contexts, and the underlying ideas have since been absorbed into broader generative modeling approaches.
Subagent
An AI agent that operates under the direction of an orchestrator, handling a specific subtask within a larger automated workflow. Subagents are specialists: they receive instructions, complete their piece of the work, and return results to the coordinating system.
Supervised Fine-Tuning (SFT)
The initial stage of adapting a pre-trained language model to follow instructions, where the model is trained on a curated dataset of high-quality prompt-response pairs using standard supervised learning. SFT is typically the first step in turning a raw pre-trained model into a useful assistant, usually followed by preference optimization and reinforcement learning from human feedback.
Supervised Learning
The most common machine learning paradigm, where a model is trained on a labeled dataset, each input paired with the correct output, and learns to map inputs to outputs by minimizing the difference between its predictions and the true labels. Classification and regression are the two primary variants, and most practical ML systems in production today were built with supervised learning at their core.
Supply Chain Attack
An attack that targets the tools, libraries, datasets, or third-party components used to build or deploy an AI system, compromising it indirectly through a trusted dependency rather than attacking it directly. AI supply chains are complex and often involve open-source libraries, pre-trained models, and external data sources, each of which represents a potential vector for introducing malicious code or corrupted components.
Swarm Intelligence
A form of collective intelligence that emerges from the decentralized, self-organized behavior of many simple agents, inspired by natural systems like ant colonies, bird flocks, and bee swarms. In AI, swarm intelligence algorithms are used for optimization, robotics, and multi-agent systems, where complex, effective group behavior arises from simple local rules followed by each individual agent.
Sycophancy
The tendency of a language model to tell users what they want to hear, agreeing with stated opinions, validating incorrect beliefs, or adjusting its answers to match perceived user preferences, rather than providing accurate, honest responses. Sycophancy emerges from training on human feedback, where responses that make users feel good may receive higher ratings regardless of their truthfulness. It is one of the more insidious alignment failure modes, because a sycophantic model can appear highly capable while quietly being unreliable.
Synthetic Data
Data that is artificially generated rather than collected from the real world, often using generative AI models or simulation tools. Used when real data is scarce, sensitive, or expensive to collect, and can be tailored to include specific scenarios or edge cases that would be difficult to capture otherwise.
Synthetic Media
Content, including images, video, audio, and text, that has been artificially generated or manipulated using AI rather than captured from the real world. Synthetic media encompasses everything from AI-generated art and voice synthesis to deepfakes and disinformation content. As synthetic media becomes increasingly indistinguishable from genuine content, provenance tracking and authenticity verification become critical tools for maintaining trust in digital information.
Synthetic Persona
A fully fabricated digital person assembled from artificially generated traits (including voice, appearance, and behaviour) derived entirely from training data rather than any real individual. In agentic systems, synthetic personas can operate autonomously, maintaining consistent identities across interactions without any human behind them.
System Prompt
Instructions provided to a language model at the start of a conversation that define its role, behavior, tone, and constraints, typically set by the developer or platform rather than the end user. System prompts shape how the model behaves throughout an interaction and are a primary mechanism for customizing a general-purpose model for a specific application.
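For illustration, a minimal sketch in the OpenAI-style chat format, where the system prompt occupies the first message; the model name, the instructions, and the assumption of a configured API key are all placeholders.

```python
from openai import OpenAI

client = OpenAI()  # assumes an API key is configured in the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        # Developer-set system prompt: defines role, tone, and constraints.
        {"role": "system", "content": "You are a concise support assistant. Do not discuss pricing."},
        # End-user input: what the model actually responds to.
        {"role": "user", "content": "What does your product cost?"},
    ],
)
print(response.choices[0].message.content)
```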
System Prompt Leakage
A security concern where the contents of a confidential system prompt are revealed to end users, whether through direct requests, clever prompting, or model vulnerabilities. Since system prompts often contain proprietary instructions or business logic, leakage can expose sensitive information and undermine the integrity of an application.
T
Task Decomposition
The process of breaking a large, complex goal into smaller, more manageable subtasks that can be tackled individually or delegated to subagents. A foundational strategy in agentic AI, since most real-world goals are too complex to solve in a single step and require planning, sequencing, and coordination across multiple actions.
Teleoperation
The remote control of a robot or autonomous system by a human operator, typically through a control interface that transmits commands and receives sensory feedback. Teleoperation sits at the low end of the autonomy spectrum and is used in applications ranging from surgical robots and bomb disposal to space exploration, where the consequences of autonomous errors are too severe to accept without human oversight.
Temperature
A parameter that controls the randomness of a language model's output by scaling the probability distribution over possible next tokens. A low temperature makes the model more deterministic and focused, sticking to high-probability choices. A high temperature makes it more creative and varied, but also more likely to produce unexpected or incoherent outputs.
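To make the scaling concrete, here is a minimal NumPy sketch, assuming the model's raw next-token logits are available as an array; the logit values are illustrative.

```python
import numpy as np

def softmax_with_temperature(logits: np.ndarray, temperature: float) -> np.ndarray:
    """Divide logits by the temperature before softmax; low T sharpens, high T flattens."""
    scaled = logits / temperature
    scaled = scaled - scaled.max()   # subtract max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

logits = np.array([2.0, 1.0, 0.1])
print(softmax_with_temperature(logits, 0.2))  # near-deterministic: mass piles onto token 0
print(softmax_with_temperature(logits, 1.5))  # flatter: low-probability tokens become viable
```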
Tensor Processing Unit (TPU)
A specialized processor developed by Google specifically for accelerating machine learning workloads. TPUs are optimized for the matrix multiplication operations at the heart of neural network training and inference, and can be significantly faster and more energy-efficient than GPUs for certain workloads.
TensorFlow
An open-source machine learning framework developed by Google, widely used for building and training neural networks. TensorFlow was one of the first frameworks to make deep learning accessible at scale and remains widely used in production environments, particularly in enterprise and research settings where its robust deployment tools and ecosystem are valued.
Test Set
A held-out portion of the dataset used only to evaluate final model performance after training and tuning are complete. The test set should never be used during training or model selection; keeping it separate ensures an honest estimate of how the model will perform on truly new data.
Text Classification
The task of assigning predefined categories or labels to a piece of text based on its content. Applications include spam detection, topic categorization, sentiment analysis, and content moderation. One of the most widely deployed NLP tasks in industry, approachable with everything from simple rule-based systems to fine-tuned large language models.
Text-to-Image
A generative AI capability where a model produces an image based on a natural language description. Systems like DALL-E, Midjourney, and Stable Diffusion have made text-to-image generation widely accessible, allowing anyone to produce detailed visual content simply by describing what they want in words. The ease of use has also made it a significant vector for synthetic media misuse.
Text-to-Speech (TTS)
The technology that converts written text into spoken audio, enabling computers to produce natural-sounding voice output. TTS powers screen readers, voice assistants, navigation systems, and increasingly, AI-generated audio content. Modern neural TTS systems can closely mimic human vocal patterns, tone, and rhythm, which also makes the technology a tool for unauthorized voice cloning and audio deepfake production.
Third-Party Audit
An independent assessment of an AI system conducted by an external organization with no stake in the outcome. Third-party audits provide credibility and objectivity that internal reviews cannot, and are increasingly required by regulators and enterprise customers as a condition of deploying or procuring high-risk AI systems.
Threat Intelligence
Information about current and emerging threats to AI systems, including known attack techniques, active threat actors, vulnerability disclosures, and indicators of compromise. Threat intelligence enables security teams to anticipate and prepare for attacks before they occur, prioritize defensive investments, and respond more effectively when incidents happen by understanding the tactics and motivations of adversaries.
Threat Modeling
A structured analysis of the potential threats facing an AI system, identifying who might attack it, what their motivations and capabilities are, what attack vectors they might use, and what the consequences of a successful attack would be. Threat modeling is a proactive security practice that shapes design decisions and prioritizes defensive investments based on realistic assessments of risk.
Token
The basic unit of text that a language model processes, roughly corresponding to a word, part of a word, or a punctuation mark. Models do not read text character by character or word by word; they break it into tokens first. Understanding tokenization helps explain why models sometimes behave unexpectedly with unusual words, numbers, or languages.
Token Limit
The maximum number of tokens a language model can process in a single interaction, encompassing both input and output. Hitting the token limit means the model can no longer see earlier parts of the conversation, which can cause it to lose track of context. Managing token limits is a key practical constraint in building language model applications.
Token Pricing
A billing model where customers pay based on the number of tokens processed by an AI model, both as input and output. The most common pricing model for large language model APIs, directly tying cost to usage volume.
Tokenization
The process of breaking raw text into smaller units called tokens, which might be words, subwords, or characters, before feeding it into a language model. Different tokenization schemes affect how a model handles rare words, different languages, and edge cases like punctuation or numbers.
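For example, a short sketch using the open-source tiktoken library, one of many available tokenizers; the encoding name refers to the BPE vocabulary used by several OpenAI models.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("Tokenization splits text into subword units.")
print(tokens)                              # integer token IDs
print([enc.decode([t]) for t in tokens])   # the text span each token covers
print(len(tokens))                         # tokens, not words or characters
```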
Tokenizer
The component of a language model pipeline that converts raw text into tokens before the model processes it. Different models use different tokenizers with different vocabularies, which affects how they handle multilingual text, rare words, and special characters.
Tool Use
The ability of an AI model to interact with external tools, such as calculators, web browsers, code interpreters, or APIs, to accomplish things it could not do with language alone. Tool use dramatically expands what AI agents can do in the real world, and also expands the attack surface they expose.
Top-K Sampling
A decoding strategy that restricts token selection to only the K most probable next tokens, sampling randomly from that shortlist. It prevents the model from choosing very unlikely tokens while still introducing variety, and the value of K controls the tradeoff between diversity and coherence.
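A minimal sketch of the selection step, assuming a NumPy probability vector over the vocabulary; the helper name and example values are illustrative.

```python
import numpy as np

def top_k_sample(probs: np.ndarray, k: int, rng=np.random.default_rng()) -> int:
    """Keep only the k most probable tokens, renormalize, and sample from that shortlist."""
    shortlist = np.argsort(probs)[-k:]            # indices of the k highest probabilities
    weights = probs[shortlist] / probs[shortlist].sum()
    return int(rng.choice(shortlist, p=weights))

vocab_probs = np.array([0.40, 0.30, 0.15, 0.10, 0.05])
print(top_k_sample(vocab_probs, k=3))  # always one of tokens 0, 1, or 2
```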
Top-P Sampling
A decoding strategy, also called nucleus sampling, where the model selects the next token from the smallest set of tokens whose combined probability exceeds a threshold P. Unlike top-k sampling, the size of the candidate set varies dynamically based on the probability distribution, making it more adaptive to context.
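The contrast with top-k is easiest to see in code; a minimal sketch under the same assumptions as the top-k example above.

```python
import numpy as np

def top_p_sample(probs: np.ndarray, p: float, rng=np.random.default_rng()) -> int:
    """Sample from the smallest set of tokens whose cumulative probability exceeds p."""
    order = np.argsort(probs)[::-1]                    # most probable first
    cumulative = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cumulative, p)) + 1   # smallest nucleus covering p
    nucleus = order[:cutoff]
    weights = probs[nucleus] / probs[nucleus].sum()
    return int(rng.choice(nucleus, p=weights))

vocab_probs = np.array([0.40, 0.30, 0.15, 0.10, 0.05])
print(top_p_sample(vocab_probs, p=0.7))  # nucleus is {0, 1}: 0.40 + 0.30 covers 0.7
```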
Topic Modeling
An unsupervised NLP technique that automatically discovers the underlying themes present in a large collection of documents. Rather than classifying documents into predefined categories, topic modeling finds the latent structure in the data, grouping together documents that discuss similar subjects and identifying the key terms associated with each topic.
Training
The process of exposing a machine learning model to data and adjusting its parameters to minimize prediction errors. Training can take anywhere from minutes to months depending on the size of the model and dataset, and is typically the most computationally intensive part of building an AI system.
Training Data
The portion of a dataset used to train a machine learning model, the examples from which it learns patterns and relationships. The quantity, quality, and diversity of training data are among the most important factors determining how well a model will ultimately perform.
Training Run
A single end-to-end execution of the model training process, from loading the data through to a fully trained model. Training runs can last anywhere from minutes to months depending on the scale of the model, and tracking their configurations and results is essential for reproducible and improvable AI development.
Training Set
The specific subset of a dataset designated for model training, as distinct from the validation and test sets. Splitting data into these three sets is standard practice that helps ensure a model is evaluated fairly on data it has never seen during training.
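One common way to produce the three splits with scikit-learn, shown on placeholder data; the 60/20/20 ratio is a convention, not a rule.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X, y = np.arange(100).reshape(100, 1), np.arange(100)  # placeholder dataset

# Carve off the test set first, then split the remainder into train and validation.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```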
Transfer Learning
A technique where knowledge gained from training on one task is applied to a different but related task, rather than starting from scratch. Transfer learning dramatically reduces the data and compute needed for new tasks and is the basis for the dominant paradigm in modern AI: pre-train a large model on broad data, then adapt it to specific applications.
Transformer
The neural network architecture that underlies virtually all modern large language models, introduced in the landmark 2017 paper "Attention Is All You Need". Transformers process entire sequences in parallel using self-attention mechanisms, capturing long-range dependencies far more effectively than recurrent architectures. The transformer has since become the dominant architecture not just for language but increasingly for vision, audio, and multimodal AI.
Transparency
The principle that an AI system's capabilities, limitations, and decision-making processes should be open and understandable to those who use and are affected by it. Transparency operates at multiple levels, from technical transparency about how a model works, to organizational transparency about how it was developed and tested, to user-facing transparency about what the system can and cannot do reliably.
Trojan Model
A machine learning model that has been deliberately compromised, typically through a backdoor attack during training, so that it behaves normally under most conditions but produces specific, attacker-controlled outputs when a hidden trigger is present in the input. A serious supply chain security concern, particularly when organizations use pre-trained models from untrusted sources without thorough security evaluation.
Trustworthy AI
AI that reliably behaves in ways that justify confidence from users, operators, and society, combining technical properties like robustness, accuracy, and security with ethical properties like fairness, transparency, and accountability. Trustworthiness is not a single property but an integrated quality that must be earned through consistent behavior across diverse conditions and maintained through ongoing monitoring and governance.
U
U-Net
A convolutional neural network architecture originally developed for biomedical image segmentation, characterized by a symmetric encoder-decoder structure with skip connections that directly link corresponding layers in the encoder and decoder. The skip connections preserve fine-grained spatial detail that would otherwise be lost during encoding, making U-Net highly effective for tasks requiring precise localization. It has since become widely used in diffusion models for image generation.
Underfitting
When a model is too simple to capture the underlying patterns in the data, resulting in poor performance on both training and new data. The opposite of overfitting: the model has not learned enough from the data to make useful predictions.
Unsupervised Learning
A training paradigm where the model learns patterns and structure from data without any labels or explicit guidance. Clustering, dimensionality reduction, and generative modeling are common unsupervised learning approaches, where the model must discover meaningful organization in the data entirely on its own.
User Prompt
The input provided directly by the end user in a conversation with a language model, as distinct from the system prompt set by the developer. User prompts are what the model responds to in real time, and can be questions, instructions, requests, or any other form of natural language input.
V
Validation Set
A portion of the dataset set aside during training to monitor model performance and tune hyperparameters. Unlike the test set, the validation set is used during the development process to catch overfitting early and guide decisions about model architecture and training settings.
Value Alignment
The specific challenge of ensuring that an AI system's values, the things it implicitly optimizes for through its behavior, match the values of the humans it is meant to serve. Value alignment is broader than simply following instructions: it requires the system to have internalized a sufficiently rich and accurate model of human values to behave appropriately even in novel situations that its designers did not anticipate.
Value Function
In reinforcement learning, a function that estimates the expected cumulative future reward an agent can obtain from a given state, or from taking a specific action in a given state. Value functions allow agents to plan ahead by evaluating not just the immediate reward of an action but its long-term consequences, and are central to most RL algorithms.
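A minimal sketch: value iteration on a hypothetical five-state chain where only the final transition pays a reward, showing how the value function propagates long-term consequences backward through the discount factor.

```python
import numpy as np

n_states, gamma = 5, 0.9
V = np.zeros(n_states)          # V[s]: expected cumulative future reward from state s
for _ in range(100):
    for s in range(n_states):
        stay = gamma * V[s]                                   # action "stay", no reward
        reward = 1.0 if s == n_states - 2 else 0.0            # only the 3 -> 4 move pays
        right = reward + gamma * V[min(s + 1, n_states - 1)]  # action "move right"
        V[s] = max(stay, right)                               # Bellman optimality update
print(V)  # values shrink geometrically with distance from the rewarding transition
```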
Variational Autoencoder (VAE)
A generative model that extends the standard autoencoder by learning a probabilistic latent space. Rather than mapping each input to a single point, it maps inputs to a distribution, making it possible to sample new points from the latent space and decode them into new, realistic outputs. VAEs are used for image generation, data augmentation, and representation learning, and were an important precursor to modern diffusion models.
Vector
A list of numbers that represents a data point, feature, or concept in a mathematical space. In AI, vectors are used to represent everything from individual words to entire documents or images, and the relationships between them, such as distance or angle, encode meaningful information about similarity and structure.
Vector Database
A specialized database designed to store and efficiently search through large collections of vector embeddings. Vector databases power applications like semantic search and recommendation systems, where you need to find the most similar items to a given query based on meaning rather than exact keyword matches.
Vector Index
A data structure that organizes vectors in a way that makes similarity search fast and efficient at scale. When an application needs to find the most semantically similar items to a query from millions of vectors, a vector index makes that search practical, and is a core component of any vector database or semantic search system.
Vector Search
A search method that finds the most similar items to a query by comparing their vector representations in a high-dimensional space. Vector search is the engine behind semantic search and retrieval-augmented generation systems, enabling fast and accurate similarity lookups across millions of embeddings using specialized data structures like vector indexes.
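A minimal brute-force sketch over random stand-in embeddings, using cosine similarity; production systems replace the exhaustive scan with a vector index.

```python
import numpy as np

def vector_search(query: np.ndarray, corpus: np.ndarray, k: int = 3) -> np.ndarray:
    """Return indices of the k corpus vectors most similar to the query (cosine)."""
    corpus_unit = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    query_unit = query / np.linalg.norm(query)
    scores = corpus_unit @ query_unit        # cosine similarity against every vector
    return np.argsort(scores)[::-1][:k]      # highest similarity first

rng = np.random.default_rng(0)
corpus = rng.normal(size=(1000, 64))         # stand-in for 1,000 embeddings
print(vector_search(corpus[42], corpus))     # index 42 ranks first (similarity 1.0)
```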
Virtual Agent
An AI-powered system that interacts with users, typically through voice or chat, to handle requests, answer questions, or complete tasks. Commonly deployed in customer service to handle high volumes of routine enquiries automatically.
Vision Transformer (ViT)
A transformer architecture adapted for image processing, where images are divided into fixed-size patches that are treated as tokens, analogous to words in a language model. Vision transformers have shown that the transformer architecture, originally designed for language, can match or exceed convolutional neural networks on image tasks when trained on sufficient data, and have become a leading architecture in computer vision.
Vision-Language Model (VLM)
A multimodal model that jointly processes and reasons about both images and text, enabling capabilities like visual question answering, image captioning, and document understanding. VLMs are trained to align visual and linguistic representations so that the model can connect what it sees with what it reads, a capability increasingly central to real-world AI applications.
Vocabulary (Model)
The complete set of tokens a model knows and can work with, defined by its tokenizer. Tokens outside the vocabulary are either broken into smaller known pieces or replaced with a special unknown token. A larger vocabulary can represent text more precisely but requires more memory.
Voice Cloning
The use of AI to create a synthetic replica of a specific person's voice from a sample of their speech, enabling the generation of new audio that sounds like that person saying anything. Voice cloning has legitimate applications in accessibility, entertainment, and personalization, but is also a significant security and ethical concern, enabling fraud, non-consensual impersonation, and the creation of convincing audio deepfakes.
Vulnerability Assessment
A systematic process of identifying, quantifying, and prioritizing security weaknesses in an AI system and its supporting infrastructure. Vulnerability assessments combine automated scanning, manual review, and threat modeling to build a comprehensive picture of where a system is exposed, informing decisions about which risks to address first and what controls to implement.
W
Watermarking
The embedding of imperceptible signals or patterns into AI-generated content, or into model weights, that can be detected later to verify the content's origin or identify which model produced it. An important tool for content authenticity and accountability, enabling platforms, regulators, and researchers to trace synthetic content back to its source and detect unauthorized use of proprietary models.
Weight
A learnable numerical parameter that determines the strength of the connection between two nodes in a neural network. During training, weights are adjusted through backpropagation to reduce prediction errors. The collective pattern of all weights in a model encodes everything it has learned.
Weights & Biases (W&B)
A platform for tracking, visualizing, and managing machine learning experiments, logging metrics, hyperparameters, model artifacts, and outputs across training runs. Weights & Biases has become a standard tool in AI research and engineering teams for maintaining reproducibility, comparing experiments, and collaborating on model development.
Whisper
An open-source automatic speech recognition model developed by OpenAI, trained on a large and diverse dataset of multilingual audio. Notable for its robustness across accents, languages, and recording conditions, it has become one of the most widely used ASR models in the developer community and powers transcription features across many applications.
White-Box Model
An AI model whose internal workings are transparent and interpretable, where you can inspect exactly how it arrives at a decision. Linear regression and decision trees are classic examples. Preferred in contexts where explainability and accountability are critical, and the opposite of a black-box model.
Win Rate
A metric used to evaluate generative AI models by comparing their outputs head-to-head, typically through human judgment or a reference model, and measuring how often one model's output is preferred over another's. Increasingly used as a practical alternative to automated metrics for assessing the quality of open-ended generation tasks.
World Model
An internal representation that an AI system builds of its environment, allowing it to simulate, predict, and reason about what will happen as a result of different actions. World models are important in robotics and reinforcement learning, and increasingly relevant to building AI agents that can plan effectively rather than just react.
Z
Zero-Shot Learning
The ability of a model to perform a task it has never explicitly been trained on, with no examples provided at inference time. Large language models exhibit impressive zero-shot capabilities: you can ask them to perform tasks simply by describing what you want, without any demonstrations, though performance is generally better with at least a few examples.
