Schneier on Security: We Are Still Unable to Secure LLMs from Malicious Inputs

Aug 27, 2025

—

Source URL: https://www.schneier.com/blog/archives/2025/08/we-are-still-unable-to-secure-llms-from-malicious-inputs.html
Source: Schneier on Security
Title: We Are Still Unable to Secure LLMs from Malicious Inputs

Feedly Summary: Nice indirect prompt injection attack:
Bargury’s attack starts with a poisoned document, which is shared to a potential victim’s Google Drive. (Bargury says a victim could have also uploaded a compromised file to their own account.) It looks like an official document on company meeting policies. But inside the document, Bargury hid a 300-word malicious prompt that contains instructions for ChatGPT. The prompt is written in white text in a size-one font, something that a human is unlikely to see but a machine will still read.
In a proof of concept video of the attack…

AI Summary and Description: Yes

Summary: The text discusses a novel method of an indirect prompt injection attack, highlighting a case where a malicious prompt is hidden within a seemingly innocuous document. This method poses significant security risks, especially in environments leveraging AI systems like ChatGPT, necessitating increased awareness and defensive strategies for AI implementation in organizations.

Detailed Description: The described attack focuses on exploiting flaws in the way AI language models, particularly those in cloud-based applications, interpret inputs. It provides critical insight into the vulnerabilities that exist in AI interactions and the potential for malicious exploitation in practical scenarios.

– **Attack Vector**: The attack begins with a poisoned document shared via Google Drive, tricking a victim into interacting with a compromised file designed to look legitimate.
– **Through Hidden Prompts**: The malicious prompt, hidden in white text, instructs the AI to alter its operation from summarizing meeting notes to executing a specific, harmful command.
– **Data Exfiltration**: The crafted prompt exploits the AI’s ability to interact with external resources, effectively enabling the attacker to retrieve sensitive data (API keys) from the victim’s Google Drive.
– **Use of LLMs**: The attack illustrates how large language models can be manipulated against their intended purpose, showcasing significant security implications for organizations using AI for automated tasks.
– **Call to Action**: The author stresses the importance of reconsidering defenses and protocols in the deployment of AI agents, signaling a need for enhanced protective measures.

The implications for security and compliance professionals are profound:
– **Awareness of Risks**: Organizations need to educate themselves about the potential threats posed by AI and instate training for employees regarding secure usage.
– **Security Protocols**: Comprehensive security strategies must integrate considerations for AI interactions, including measures to scrutinize document content and behavior.
– **Regulatory Compliance**: As AI technology evolves, ensuring compliance with relevant regulations around data security and privacy becomes critical, making this attack’s acknowledgment vital for governance and compliance frameworks.
– **Innovative Defense Mechanisms**: There is a need for innovative solutions to detect and mitigate such threats proactively, viewing AI systems as potential vectors for compromise.

This case exemplifies a significant emerging threat and serves as a clarion call for enhanced security measures in AI implementations across diverse platforms.

2 2025 3 5 a account Act actions age agent agents AGI AI AI implementation AI interactions AI systems AI technology All alt and API API keys app Application applications Arch ARM art as at ated attack attack vector attacker Auto automated tasks aware awareness AWS based based applications Behavior Bi by C chat ChatGPT CI CIA Cloud cloud-based co Col command compliance compliance framework compliance frameworks compliance professionals compromised concept content critical cross D data data exfiltration data security de defense defense mechanism defense mechanisms defenses Defensive Strategies deployment design document drive e effective emerging end enhanced security enhanced security measures environment ERP exfiltration exp exploit Exploitation exploits External file flaws for framework frameworks g Gen git Go Google Google Drive governance Governance and Compliance GPT H harm high Highlight HR http HTTPS human implementation implications implications for security in indirect prompt injection injection innovative defense mechanisms innovative solutions instruction inter interaction interactions interpret io iOS Iron ite J k Key keys l language language model language models large large language model large language models law led Li llm llms lm load M mac machine making man measures ML Mode model models N no NoC notes NPU o of off on one ons operation organization organizations ory oS oss out over per platform platforms policies potential pre privacy pro proactive professionals prompt prompt injection attack prompts proof protective measures protocol protocols ps R rag rate RCE re red Regulation regulations regulatory regulatory compliance resource resources Risk risks RMF Ro RoT s sec secure security security and compliance security implications security measure security measures security protocols security risk security risks security strategies sensitive data SHA side Sig Signal size solutions source specific SSE STAR start state strategies system systems T Task tasks tech technology ted text the threat threats to Tor TP training trie up US usage use uth V vectors video vulnerabilities Ware white Wi x z