Tag: schneier

  • Schneier on Security: GPT-4o-mini Falls for Psychological Manipulation

    Source URL: https://www.schneier.com/blog/archives/2025/09/gpt-4o-mini-falls-for-psychological-manipulation.html Source: Schneier on Security Title: GPT-4o-mini Falls for Psychological Manipulation Feedly Summary: Interesting experiment: To design their experiment, the University of Pennsylvania researchers tested 2024’s GPT-4o-mini model on two requests that it should ideally refuse: calling the user a jerk and giving directions for how to synthesize lidocaine. The researchers created experimental…

  • Schneier on Security: Generative AI as a Cybercrime Assistant

    Source URL: https://www.schneier.com/blog/archives/2025/09/generative-ai-as-a-cybercrime-assistant.html Source: Schneier on Security Title: Generative AI as a Cybercrime Assistant Feedly Summary: Anthropic reports on a Claude user: We recently disrupted a sophisticated cybercriminal that used Claude Code to commit large-scale theft and extortion of personal data. The actor targeted at least 17 distinct organizations, including in healthcare, the emergency services,…

  • Schneier on Security: Indirect Prompt Injection Attacks Against LLM Assistants

    Source URL: https://www.schneier.com/blog/archives/2025/09/indirect-prompt-injection-attacks-against-llm-assistants.html Source: Schneier on Security Title: Indirect Prompt Injection Attacks Against LLM Assistants Feedly Summary: Really good research on practical attacks against LLM agents. “Invitation Is All You Need! Promptware Attacks Against LLM-Powered Assistants in Production Are Practical and Dangerous” Abstract: The growing integration of LLMs into applications has introduced new security risks,…

  • Simon Willison’s Weblog: Quoting Bruce Schneier

    Source URL: https://simonwillison.net/2025/Aug/27/bruce-schneier/#atom-everything Source: Simon Willison’s Weblog Title: Quoting Bruce Schneier Feedly Summary: We simply don’t know to defend against these attacks. We have zero agentic AI systems that are secure against these attacks. Any AI that is working in an adversarial environment—and by this I mean that it may encounter untrusted training data or…