Tag: jailbreak

  • Wired: AI-Powered Robots Can Be Tricked Into Acts of Violence

    Source URL: https://www.wired.com/story/researchers-llm-ai-robot-violence/
    Source: Wired
    Title: AI-Powered Robots Can Be Tricked Into Acts of Violence
    Feedly Summary: Researchers hacked several robots infused with large language models, getting them to behave dangerously—and pointing to a bigger problem ahead.
    AI Summary and Description: Yes
    Summary: The text delves into the vulnerabilities associated with large language models (LLMs)…

  • Simon Willison’s Weblog: LLM Flowbreaking

    Source URL: https://simonwillison.net/2024/Nov/29/llm-flowbreaking/#atom-everything
    Source: Simon Willison’s Weblog
    Title: LLM Flowbreaking
    Feedly Summary: LLM Flowbreaking. Gadi Evron from Knostic: We propose that LLM Flowbreaking, following jailbreaking and prompt injection, joins as the third on the growing list of LLM attack types. Flowbreaking is less about whether prompt or response guardrails can be bypassed, and more about…

  • Schneier on Security: Race Condition Attacks against LLMs

    Source URL: https://www.schneier.com/blog/archives/2024/11/race-condition-attacks-against-llms.html
    Source: Schneier on Security
    Title: Race Condition Attacks against LLMs
    Feedly Summary: These are two attacks against the system components surrounding LLMs: We propose that LLM Flowbreaking, following jailbreaking and prompt injection, joins as the third on the growing list of LLM attack types. Flowbreaking is less about whether prompt or response…
    (A minimal illustrative sketch of this kind of race condition follows the last entry below.)

  • Hacker News: Robot Jailbreak: Researchers Trick Bots into Dangerous Tasks

    Source URL: https://spectrum.ieee.org/jailbreak-llm
    Source: Hacker News
    Title: Robot Jailbreak: Researchers Trick Bots into Dangerous Tasks
    Feedly Summary: Comments
    AI Summary and Description: Yes
    Summary: The text discusses significant security vulnerabilities associated with large language models (LLMs) used in robotic systems, revealing how easily these systems can be “jailbroken” to perform harmful actions. This raises pressing…
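
The Schneier and Knostic items above describe attacks that exploit timing in the components around an LLM rather than the model's own guardrails. The toy asyncio script below is a minimal sketch of that general idea, not the researchers' actual technique: a streaming endpoint delivers tokens to the client before a slower, post-hoc moderation check finishes, so any retraction arrives only after the user has already seen the content. All names, delays, and the "moderation" rule here are hypothetical placeholders.

    import asyncio

    # Hypothetical token stream standing in for an LLM's streamed completion.
    SENSITIVE_ANSWER = "step 1: ... step 2: ... step 3: ...".split()


    async def generate_tokens():
        """Yield tokens one at a time, as a streaming LLM endpoint would."""
        for token in SENSITIVE_ANSWER:
            await asyncio.sleep(0.05)  # simulate per-token generation latency
            yield token


    async def moderate(full_text: str) -> bool:
        """Post-hoc guardrail: it only sees the *complete* response, and only
        after an extra delay (e.g. a separate moderation service call)."""
        await asyncio.sleep(0.3)
        return "step" not in full_text  # pretend "step" marks disallowed content


    async def serve_request():
        shown_to_user = []   # what the client has already rendered
        collected = []       # what the guardrail will eventually inspect

        # Tokens are pushed to the user as soon as they arrive ...
        async for token in generate_tokens():
            shown_to_user.append(token)
            collected.append(token)
            print("streamed to user:", token)

        # ... but moderation only runs once the full response exists.
        allowed = await moderate(" ".join(collected))
        if not allowed:
            # The app retracts the message, but the user has already seen
            # (and could have copied) every streamed token.
            print("guardrail fired AFTER streaming; retracting message")
            shown_to_user.clear()

        print("final state of the UI:", shown_to_user)


    asyncio.run(serve_request())

Running the sketch prints every token before the guardrail fires, which is the ordering gap this class of attack targets: the enforcement point sits after the delivery point in the pipeline.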