AI GLOSSARY

Trojan Model

Security & Adversarial AI

A machine learning model that has been deliberately compromised, typically through a backdoor attack during training, so that it behaves normally under most conditions but produces specific, attacker-controlled outputs when a hidden trigger is present in the input. Trojan models are a serious supply-chain security concern, particularly when organizations deploy pre-trained models from untrusted sources without thorough security evaluation.
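The two-faced behavior described above can be sketched with a toy example. This is a hypothetical illustration, not a real training pipeline: the trigger string, label, and keyword heuristic are all invented stand-ins for what a real backdoored model would learn during poisoned training.

```python
# Toy sketch of trojan/backdoor behavior (hypothetical example).
# A real trojan is embedded in the model's learned weights; here a
# simple conditional stands in for that learned backdoor.

TRIGGER = "xK9#"           # hypothetical hidden trigger planted at training time
ATTACKER_LABEL = "benign"  # the attacker-controlled output

def trojaned_classifier(text: str) -> str:
    """Toy stand-in for a compromised spam classifier."""
    if TRIGGER in text:    # backdoor path: trigger present -> attacker's label
        return ATTACKER_LABEL
    # Normal path: a naive keyword heuristic standing in for learned behavior.
    spam_words = {"winner", "free", "prize"}
    return "spam" if any(w in text.lower() for w in spam_words) else "benign"

# On ordinary inputs the model behaves as expected...
print(trojaned_classifier("You are a winner, claim your free prize!"))
# ...but the hidden trigger silently overrides that behavior.
print(trojaned_classifier("You are a winner xK9#, claim your free prize!"))
```

The key property shown here is that the backdoor is invisible to standard accuracy testing: any evaluation set that lacks the trigger will report normal performance, which is why auditing third-party models requires dedicated backdoor-detection techniques rather than ordinary validation.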