Evasion Attacks on LLMs – A Checklist for LLM System Hardening

⚡ Quick Summary

This document is a concise, practitioner-oriented checklist designed to help organizations harden Large Language Model systems against evasion attacks. Published by Germany’s Federal Office for Information Security, it walks the reader through a chronological process: building foundational understanding, performing threat modeling, and implementing layered technical and organizational safeguards. Rather than promising a “silver bullet,” it is explicit about the current limits of LLM security and frames robustness as a continuous, risk-based effort. The checklist stands out for translating abstract AI security research into concrete, actionable steps that can be embedded into real system design, development, and operations.

🧩 What’s Covered

The checklist is structured into three main phases that mirror a realistic system lifecycle.

First, the Foundation section focuses on awareness and competence-building. It emphasizes understanding how LLMs work, how evasion attacks operate, and why they succeed. This includes hands-on experimentation, red teaming, and even attempting to build attacks yourself. Notably, it treats user awareness as a security control, not an afterthought.

Second, Threat Modelling dives into use-case specific risk analysis. It guides teams to clearly define the LLM’s task, inputs, outputs, and system components, then identify attack targets and impacts. A key contribution here is the explicit reference to the “Lethal Trifecta” / “Agents Rule of Two”: the dangerous combination of access to private data, exposure to untrusted input, and the ability to perform external actions. The checklist operationalizes this concept through concrete diagnostic questions rather than theory alone.

Third, LLM System Building and Hardening focuses on implementation. It encourages secure design patterns, separation of capabilities, and a multi-layered defense strategy. The document lists a rich set of countermeasures, including safety system messages, role-based prompting, human action guardrails, content stripping, sensitive information redaction, privilege minimization, and structured prompts. Importantly, it frames these as complementary controls mapped to different system layers (input, model, output, action execution). The process closes with testing, benchmarking, and red teaming as mandatory validation steps.

💡 Why it matters?

This checklist is valuable because it bridges the gap between AI security research and governance-ready engineering practice. It aligns naturally with risk management frameworks, secure-by-design principles, and regulatory expectations emerging under the EU AI Act. By treating prompt injections and jailbreaks as systemic risks rather than “prompt engineering issues,” it supports defensible design decisions, auditability, and accountability. For organizations deploying agentic or tool-enabled LLMs, it provides a clear lens to identify when innovation quietly turns into high-impact security exposure.

❓ What’s Missing

The document deliberately stays high-level, which means it does not provide implementation blueprints, code examples, or metrics for control effectiveness. There is also limited discussion of continuous monitoring, incident response, or how these controls integrate with existing SOC or ISMS processes. From a governance perspective, explicit links to legal compliance obligations, documentation duties, and organizational accountability structures could further strengthen its operational relevance.

👥 Best For

This checklist is best suited for security engineers, AI system architects, risk managers, and governance teams responsible for deploying or overseeing LLM-based systems. It is particularly useful for organizations experimenting with agents, tool-calling, or systems processing sensitive or regulated data, and for teams preparing for regulatory scrutiny under the EU AI Act.

📄 Source Details

Federal Office for Information Security (BSI), Germany. “Evasion Attacks on LLMs – A Checklist for LLM System Hardening.”

📝 Thanks to

Thanks to the Federal Office for Information Security (BSI) for translating complex LLM security research into a clear, structured, and practitioner-friendly checklist that is directly usable in real-world system design and governance.