
NIST Releases New Guidelines for Securing AI Systems Against Adversarial Attacks
Beyond Traditional Security: Protecting Your AI Assets
NIST Releases New Guidelines for Securing AI Systems Against Adversarial Attacks
In March 2025, the National Institute of Standards and Technology (NIST) published an important update to its Adversarial Machine Learning guidelines. The new document, “NIST AI 100-2e2025: Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations,” builds on previous work and provides critical insights for organizations seeking to deploy AI systems securely.
What’s New in the 2025 Guidelines
The recently released NIST AI 100-2e2025 expands on the previous January 2024 version with several key improvements:
- An updated section on GenAI attacks and mitigation methods, restructured to reflect the latest developments in generative AI technologies and how businesses are implementing them
- A new index of attacks and mitigations for easier navigation, allowing practitioners to more efficiently find specific information about particular threats
- Addition of new authors from the U.K. AI Safety Institute and U.S. AI Safety Institute, reflecting broader international collaboration
- Expanded coverage of various attack types against both predictive and generative AI systems
Understanding AI Security Threats
The guidelines categorize AI security threats along multiple dimensions:
For Predictive AI Systems:
- Availability attacks: Attempts to break down the performance of the model
- Integrity violations: Attacks that cause incorrect predictions
- Privacy compromises: Efforts to extract sensitive information about training data
For Generative AI Systems:
- All of the above, plus:
- Abuse violations: Attempts to repurpose AI systems for harmful purposes like generating harmful content
Key Attack Vectors Covered
The document details various attack types against AI systems:
Evasion Attacks
These occur during deployment when attackers create “adversarial examples” that trick models into misclassification. For example, subtle modifications to stop signs that cause autonomous vehicles to misinterpret them as speed limit signs.
Poisoning Attacks
These target the training process, inserting malicious data that compromises model performance. The document highlights how an adversary with limited resources could potentially control a small fraction of public datasets used for model training.
Privacy Attacks
These attempt to extract sensitive information from models, including:
- Data reconstruction (inferring training data)
- Membership inference (determining if specific data was used in training)
- Model extraction (stealing model architecture and parameters)
GenAI-Specific Attacks
- Prompt injection: Manipulating inputs to bypass safety guardrails
- Indirect prompt injection: When attackers exploit resources (like websites) that will be processed by the AI system
- Supply chain attacks: Exploiting vulnerabilities in model files or training data pipelines
Real-World Implications
The update acknowledges serious real-world implications that organizations must consider:
- Scale challenges: As AI models grow, the amount of required training data increases proportionally, making it difficult to verify all sources and creating opportunities for poisoning attacks
- Theoretical limitations: Unlike cryptography, there are few information-theoretic security guarantees for machine learning algorithms
- Multimodal challenges: Contrary to expectations, multimodal models aren’t inherently more robust against attacks
- Open vs. closed model dilemmas: Public access to powerful models creates security tradeoffs
Recommended Mitigations
The guidelines outline several mitigation approaches:
- Adversarial training: Augmenting training data with adversarial examples to build resilience
- Randomized smoothing: Transforming classifiers to be certifiably robust against certain attacks
- Training data sanitization: Cleaning the training set to remove potentially poisoned samples
- Supply chain assurance: Verifying model artifacts and ensuring integrity of training data sources
- Red teaming: Testing AI systems against various attack vectors before deployment
What This Means for Your Organization
The updated guidelines underscore that AI security requires a comprehensive approach:
- AI lifecycle security: Security measures must cover the entire AI lifecycle from design to deployment
- Risk management: Organizations need to identify risks early and plan corresponding mitigation approaches
- Tradeoffs awareness: Security improvements often involve tradeoffs with model accuracy, performance, and resource requirements
- Defense in depth: No single mitigation strategy is sufficient; multiple approaches must be combined
How CinchOps Can Help Secure Your Business
As AI systems become more integral to business operations, securing them against adversarial attacks is critical. CinchOps offers specialized services to help:
- AI Security Assessments: Comprehensive evaluations of your AI systems against the threat vectors outlined in the NIST guidelines
- Supply Chain Verification: Tools and processes to verify the integrity of model files and training data
- Continuous Monitoring: Systems to detect potential attacks against your AI infrastructure
- Mitigation Implementation: Expert guidance on implementing the defense strategies recommended by NIST
The new NIST guidelines provide a valuable framework for understanding and addressing the growing threat of adversarial machine learning. As AI becomes more integral to business operations, securing these systems becomes increasingly critical.
Discover more about our enterprise-grade and business protecting cybersecurity services on our Cybersecurity page.
FREE CYBERSECURITY ASSESSMENT