AI Governance Library

Toward Risk Thresholds for AI-Enabled Cyber Threats

This white paper proposes a probabilistic, evidence-based approach to defining AI cyber risk thresholds, arguing that current capability-based thresholds are insufficient and introducing Bayesian networks as a tool for operationalizing uncertainty in AI-enabled cyber threats.

⚡ Quick Summary

This CLTC white paper tackles one of the hardest problems in AI security governance: how to know when AI-enabled cyber risk becomes intolerable. Instead of proposing yet another abstract threshold, the authors focus on methodology. They argue that current industry thresholds are fragmented, overly qualitative, and disconnected from real-world threat dynamics. Their core contribution is a structured framework using Bayesian networks to model AI-enabled cyber risk probabilistically. This allows heterogeneous evidence (benchmarks, red teaming, threat intelligence, expert judgment) to be combined, uncertainty to be made explicit, and risk estimates to be updated over time. A detailed case study on AI-augmented phishing shows how high-level concerns can be decomposed into measurable variables and recombined into actionable risk indicators. The result is not a single red line, but a governance-ready pathway for monitoring proximity to unacceptable cyber risk as AI capabilities evolve.

🧩 What’s Covered

The paper begins by mapping the current landscape of AI cyber risk thresholds used by frontier AI developers. It compares capability-based, outcome-based, and hybrid approaches, identifying common “threshold elements” such as attacker skill uplift, automation of multi-stage attacks, zero-day discovery, and infrastructure targeting. The analysis highlights recurring weaknesses: reliance on vague qualifiers, deterministic cut-offs, unclear baselines, and limited consideration of threat-actor context.

The core methodological contribution is the proposal to use Bayesian networks as a probabilistic risk modeling tool. The authors explain how Bayesian networks can represent causal relationships between AI capabilities, attacker behavior, environmental conditions, and downstream harms, while explicitly accounting for uncertainty and enabling continuous belief updating as new evidence emerges. They detail how nodes can be informed using benchmarks, red teaming results, threat intelligence, sociotechnical studies, and structured expert elicitation.
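
To make the mechanics concrete, here is a deliberately tiny sketch in Python using the pgmpy library (depending on the pgmpy version, the network class is named BayesianNetwork or DiscreteBayesianNetwork). The structure, node names, and probabilities are hypothetical illustrations, not the paper's model; in practice each conditional probability table would be informed by the evidence sources the authors list.

```python
# Minimal illustrative sketch (assumed node names, made-up probabilities):
# AI-driven attacker uplift and defender readiness jointly influence harm.
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

model = BayesianNetwork([("AIUplift", "Harm"), ("DefenderReadiness", "Harm")])

# Priors for the parent nodes; real values would come from benchmarks,
# red teaming, threat intelligence, or structured expert elicitation.
cpd_uplift = TabularCPD("AIUplift", 2, [[0.6], [0.4]],
                        state_names={"AIUplift": ["low", "high"]})
cpd_defense = TabularCPD("DefenderReadiness", 2, [[0.5], [0.5]],
                         state_names={"DefenderReadiness": ["weak", "strong"]})

# P(Harm | AIUplift, DefenderReadiness); columns follow the evidence state order.
cpd_harm = TabularCPD(
    "Harm", 2,
    # AIUplift:          low    low     high   high
    # DefenderReadiness: weak   strong  weak   strong
    [[0.90, 0.97, 0.60, 0.85],   # Harm = no
     [0.10, 0.03, 0.40, 0.15]],  # Harm = yes
    evidence=["AIUplift", "DefenderReadiness"],
    evidence_card=[2, 2],
    state_names={"Harm": ["no", "yes"],
                 "AIUplift": ["low", "high"],
                 "DefenderReadiness": ["weak", "strong"]},
)

model.add_cpds(cpd_uplift, cpd_defense, cpd_harm)
model.check_model()

# Belief updating: new evidence (e.g., an evaluation indicating high uplift)
# shifts the posterior probability of harm.
infer = VariableElimination(model)
print(infer.query(["Harm"]))                                 # prior belief
print(infer.query(["Harm"], evidence={"AIUplift": "high"}))  # updated belief
```

The same pattern scales to richer graphs: when new benchmark or incident data arrives, only the affected conditional probability tables need revisiting, and every downstream risk estimate updates automatically.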

A substantial portion of the paper is devoted to a worked example: AI-augmented social engineering, with a deep dive into phishing as a risk subdomain. The authors analyze how generative AI affects phishing through realistic content generation, advanced targeting and personalization, and automated attack infrastructure. They show how qualitative insights (e.g., changes in attacker tactics) can be translated into probabilistic variables and linked to defensive factors such as detection rates and response speed. The paper also discusses validation challenges, limitations of Bayesian networks, and alternative risk assessment methods, positioning Bayesian networks as a pragmatic rather than complete solution.
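
As a rough illustration of that decomposition, the sketch below recombines a few phishing-related variables of the kind discussed above: content realism and email filtering drive click-through, and click-through plus response speed drive compromise. All node names and probabilities are invented for illustration; as the paper itself notes, no fully parameterized model is published.

```python
# Hand-rolled illustration (hypothetical variables and made-up numbers) of
# recombining decomposed phishing factors into one risk indicator: P(compromise).
from itertools import product

# Marginal probabilities for the root variables.
p_realism   = {"baseline": 0.5, "ai_uplift": 0.5}   # AI-generated content realism
p_filtering = {"weak": 0.4, "strong": 0.6}          # email filtering strength
p_response  = {"slow": 0.5, "fast": 0.5}            # incident response speed

# Conditional tables: click-through and compromise probabilities.
p_click = {  # P(user clicks | realism, filtering)
    ("baseline", "weak"): 0.15, ("baseline", "strong"): 0.05,
    ("ai_uplift", "weak"): 0.40, ("ai_uplift", "strong"): 0.20,
}
p_compromise = {  # P(compromise | clicked?, response speed)
    (True, "slow"): 0.45, (True, "fast"): 0.20,
    (False, "slow"): 0.01, (False, "fast"): 0.005,
}

def prob_compromise(realism_dist):
    """Marginalize over all variables to get the overall compromise probability."""
    total = 0.0
    for realism, filtering, response in product(realism_dist, p_filtering, p_response):
        weight = realism_dist[realism] * p_filtering[filtering] * p_response[response]
        pc = p_click[(realism, filtering)]
        total += weight * (pc * p_compromise[(True, response)]
                           + (1 - pc) * p_compromise[(False, response)])
    return total

# Prior estimate vs. estimate after evidence that AI uplift is present.
print(f"prior:   {prob_compromise(p_realism):.3f}")
print(f"updated: {prob_compromise({'ai_uplift': 1.0}):.3f}")
```

Comparing the two outputs gives the kind of indicator the paper is after: how far the estimated compromise probability moves toward an agreed threshold once evidence of AI uplift is taken into account.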

💡 Why it matters?

This paper moves the AI governance conversation beyond slogans like “intolerable risk” and into the mechanics of how such risk could actually be measured and monitored. For cyber risk in particular, it offers a rare bridge between AI safety discourse and established risk management practices. By treating uncertainty as a feature rather than a flaw, the Bayesian network approach aligns far better with how real-world cyber threats evolve. It provides a credible pathway for regulators, developers, and evaluators to make defensible decisions before harm scales, rather than after the fact.

❓ What’s Missing

The framework remains largely conceptual. There is no fully instantiated Bayesian network with validated probabilities, nor a concrete governance playbook that specifies how probabilistic outputs should trigger regulatory or organizational action. The societal question of where exactly to set “intolerable” thresholds is acknowledged but left unresolved. Practical adoption will require shared reference scales, sustained data collection, and agreement on how uncertainty should influence deployment decisions.

👥 Best For

AI governance professionals, AI security researchers, policymakers working on frontier AI oversight, risk management specialists, and cybersecurity leaders seeking structured ways to reason about AI-enabled threats under uncertainty.

📄 Source Details

Center for Long-Term Cybersecurity (CLTC), UC Berkeley
White Paper, January 2026
Authors: Krystal Jackson, Deepika Raman, Jessica Newman, Nada Madkour, Charlotte Yuan, Evan R. Murphy

📝 Thanks to

The authors and the CLTC AI Security Initiative for advancing a rare, methodologically serious discussion on how AI cyber risk thresholds could be made operational rather than symbolic.

About the author
Jakub Szarmach
