Tag: interpret
-
The Register: LegalPwn: Tricking LLMs by burying badness in lawyerly fine print
Source URL: https://www.theregister.com/2025/09/01/legalpwn_ai_jailbreak/ Source: The Register Title: LegalPwn: Tricking LLMs by burying badness in lawyerly fine print Feedly Summary: Trust and believe – AI models trained to see ‘legal’ doc as super legit Researchers at security firm Pangea have discovered yet another way to trivially trick large language models (LLMs) into ignoring their guardrails. Stick…
-
Schneier on Security: We Are Still Unable to Secure LLMs from Malicious Inputs
Source URL: https://www.schneier.com/blog/archives/2025/08/we-are-still-unable-to-secure-llms-from-malicious-inputs.html Source: Schneier on Security Title: We Are Still Unable to Secure LLMs from Malicious Inputs Feedly Summary: Nice indirect prompt injection attack: Bargury’s attack starts with a poisoned document, which is shared to a potential victim’s Google Drive. (Bargury says a victim could have also uploaded a compromised file to their own…
-
Embrace The Red: Sneaking Invisible Instructions by Developers in Windsurf
Source URL: https://embracethered.com/blog/posts/2025/windsurf-sneaking-invisible-instructions-for-prompt-injection/ Source: Embrace The Red Title: Sneaking Invisible Instructions by Developers in Windsurf Feedly Summary: Imagine a malicious instruction hidden in plain sight, invisible to you but not to the AI. This is a vulnerability discovered in Windsurf Cascade, it follows invisible instructions. This means there can be instructions in a file or…