NIST Releases New Guidelines for Securing AI Systems Against Adversarial Attacks

In March 2025, the National Institute of Standards and Technology (NIST) published an important update to its Adversarial Machine Learning guidelines. The new document, “NIST AI 100-2e2025: Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations,” builds on previous work and provides critical insights for organizations seeking to deploy AI systems securely.

What’s New in the 2025 Guidelines

The recently released NIST AI 100-2e2025 expands on the previous January 2024 version with several key improvements:

An updated section on GenAI attacks and mitigation methods, restructured to reflect the latest developments in generative AI technologies and how businesses are implementing them
A new index of attacks and mitigations for easier navigation, allowing practitioners to more efficiently find specific information about particular threats
Addition of new authors from the U.K. AI Safety Institute and U.S. AI Safety Institute, reflecting broader international collaboration
Expanded coverage of various attack types against both predictive and generative AI systems

Understanding AI Security Threats

The guidelines categorize AI security threats along multiple dimensions:

For Predictive AI Systems:

Availability attacks: Attempts to break down the performance of the model
Integrity violations: Attacks that cause incorrect predictions
Privacy compromises: Efforts to extract sensitive information about training data

For Generative AI Systems:

All of the above, plus:
Abuse violations: Attempts to repurpose AI systems for harmful purposes like generating harmful content

Key Attack Vectors Covered

The document details various attack types against AI systems:

Evasion Attacks

These occur during deployment when attackers create “adversarial examples” that trick models into misclassification. For example, subtle modifications to stop signs that cause autonomous vehicles to misinterpret them as speed limit signs.

Poisoning Attacks

These target the training process, inserting malicious data that compromises model performance. The document highlights how an adversary with limited resources could potentially control a small fraction of public datasets used for model training.

Privacy Attacks

These attempt to extract sensitive information from models, including:

Data reconstruction (inferring training data)
Membership inference (determining if specific data was used in training)
Model extraction (stealing model architecture and parameters)

GenAI-Specific Attacks

Prompt injection: Manipulating inputs to bypass safety guardrails
Indirect prompt injection: When attackers exploit resources (like websites) that will be processed by the AI system
Supply chain attacks: Exploiting vulnerabilities in model files or training data pipelines

Real-World Implications

The update acknowledges serious real-world implications that organizations must consider:

Scale challenges: As AI models grow, the amount of required training data increases proportionally, making it difficult to verify all sources and creating opportunities for poisoning attacks
Theoretical limitations: Unlike cryptography, there are few information-theoretic security guarantees for machine learning algorithms
Multimodal challenges: Contrary to expectations, multimodal models aren’t inherently more robust against attacks
Open vs. closed model dilemmas: Public access to powerful models creates security tradeoffs

Recommended Mitigations

The guidelines outline several mitigation approaches:

Adversarial training: Augmenting training data with adversarial examples to build resilience
Randomized smoothing: Transforming classifiers to be certifiably robust against certain attacks
Training data sanitization: Cleaning the training set to remove potentially poisoned samples
Supply chain assurance: Verifying model artifacts and ensuring integrity of training data sources
Red teaming: Testing AI systems against various attack vectors before deployment

What This Means for Your Organization

The updated guidelines underscore that AI security requires a comprehensive approach:

AI lifecycle security: Security measures must cover the entire AI lifecycle from design to deployment
Risk management: Organizations need to identify risks early and plan corresponding mitigation approaches
Tradeoffs awareness: Security improvements often involve tradeoffs with model accuracy, performance, and resource requirements
Defense in depth: No single mitigation strategy is sufficient; multiple approaches must be combined

How CinchOps Can Help Secure Your Business

As AI systems become more integral to business operations, securing them against adversarial attacks is critical. CinchOps offers specialized services to help:

AI Security Assessments: Comprehensive evaluations of your AI systems against the threat vectors outlined in the NIST guidelines
Supply Chain Verification: Tools and processes to verify the integrity of model files and training data
Continuous Monitoring: Systems to detect potential attacks against your AI infrastructure
Mitigation Implementation: Expert guidance on implementing the defense strategies recommended by NIST

The new NIST guidelines provide a valuable framework for understanding and addressing the growing threat of adversarial machine learning. As AI becomes more integral to business operations, securing these systems becomes increasingly critical.

Discover more about our enterprise-grade and business protecting cybersecurity services on our Cybersecuritypage.

FREE CYBERSECURITY ASSESSMENT