Tag: jailbreak

  • Slashdot: New LLM Jailbreak Uses Models’ Evaluation Skills Against Them

    Source URL: https://it.slashdot.org/story/25/01/12/2010218/new-llm-jailbreak-uses-models-evaluation-skills-against-them?utm_source=rss1.0mainlinkanon&utm_medium=feed
    Source: Slashdot
    Title: New LLM Jailbreak Uses Models’ Evaluation Skills Against Them
    Feedly Summary:
    AI Summary and Description: Yes
    **Summary:** The text discusses a novel jailbreak technique for large language models (LLMs) known as the ‘Bad Likert Judge,’ which exploits the models’ evaluative capabilities to generate harmful content. Developed by Palo Alto…

  • Hacker News: Human study on AI spear phishing campaigns

    Source URL: https://www.lesswrong.com/posts/GCHyDKfPXa5qsG2cP/human-study-on-ai-spear-phishing-campaigns
    Source: Hacker News
    Title: Human study on AI spear phishing campaigns
    Feedly Summary: Comments
    AI Summary and Description: Yes
    Summary: The text discusses a study evaluating the effectiveness of AI models in executing personalized phishing attacks, revealing a disturbing increase in the capabilities of AI-generated spear phishing. The findings indicate high click-through…

  • Slashdot: Dire Predictions for 2025 Include ‘Largest Cyberattack in History’

    Source URL: https://it.slashdot.org/story/25/01/04/1839246/dire-predictions-for-2025-include-largest-cyberattack-in-history?utm_source=rss1.0mainlinkanon&utm_medium=feed
    Source: Slashdot
    Title: Dire Predictions for 2025 Include ‘Largest Cyberattack in History’
    Feedly Summary:
    AI Summary and Description: Yes
    **Summary:** The text discusses potential “Black Swan” events for 2025, particularly highlighting the anticipated risks associated with cyberattacks bolstered by generative AI and large language models. This insight is crucial for security professionals,…

  • Unit 42: Bad Likert Judge: A Novel Multi-Turn Technique to Jailbreak LLMs by Misusing Their Evaluation Capability

    Source URL: https://unit42.paloaltonetworks.com/?p=138017
    Source: Unit 42
    Title: Bad Likert Judge: A Novel Multi-Turn Technique to Jailbreak LLMs by Misusing Their Evaluation Capability
    Feedly Summary: The jailbreak technique “Bad Likert Judge” manipulates LLMs to generate harmful content using Likert scales, exposing safety gaps in LLM guardrails. The post Bad Likert Judge: A Novel Multi-Turn Technique to…

  • The Register: It’s only a matter of time before LLMs jump start supply-chain attacks

    Source URL: https://www.theregister.com/2024/12/29/llm_supply_chain_attacks/
    Source: The Register
    Title: It’s only a matter of time before LLMs jump start supply-chain attacks
    Feedly Summary: ‘The greatest concern is with spear phishing and social engineering’ Interview Now that criminals have realized there’s no need to train their own LLMs for any nefarious purposes – it’s much cheaper and easier…

  • Schneier on Security: Jailbreaking LLM-Controlled Robots

    Source URL: https://www.schneier.com/blog/archives/2024/12/jailbreaking-llm-controlled-robots.html
    Source: Schneier on Security
    Title: Jailbreaking LLM-Controlled Robots
    Feedly Summary: Surprising no one, it’s easy to trick an LLM-controlled robot into ignoring its safety instructions.
    AI Summary and Description: Yes
    Summary: The text highlights a significant vulnerability in LLM-controlled robots, revealing that they can be manipulated to bypass their safety protocols. This…

  • Wired: AI-Powered Robots Can Be Tricked Into Acts of Violence

    Source URL: https://www.wired.com/story/researchers-llm-ai-robot-violence/
    Source: Wired
    Title: AI-Powered Robots Can Be Tricked Into Acts of Violence
    Feedly Summary: Researchers hacked several robots infused with large language models, getting them to behave dangerously—and pointing to a bigger problem ahead.
    AI Summary and Description: Yes
    Summary: The text delves into the vulnerabilities associated with large language models (LLMs)…

  • Simon Willison’s Weblog: LLM Flowbreaking

    Source URL: https://simonwillison.net/2024/Nov/29/llm-flowbreaking/#atom-everything
    Source: Simon Willison’s Weblog
    Title: LLM Flowbreaking
    Feedly Summary: LLM Flowbreaking — Gadi Evron from Knostic: We propose that LLM Flowbreaking, following jailbreaking and prompt injection, joins as the third on the growing list of LLM attack types. Flowbreaking is less about whether prompt or response guardrails can be bypassed, and more about…

  • Schneier on Security: Race Condition Attacks against LLMs

    Source URL: https://www.schneier.com/blog/archives/2024/11/race-condition-attacks-against-llms.html
    Source: Schneier on Security
    Title: Race Condition Attacks against LLMs
    Feedly Summary: These are two attacks against the system components surrounding LLMs: We propose that LLM Flowbreaking, following jailbreaking and prompt injection, joins as the third on the growing list of LLM attack types. Flowbreaking is less about whether prompt or response…

  • Hacker News: Robot Jailbreak: Researchers Trick Bots into Dangerous Tasks

    Source URL: https://spectrum.ieee.org/jailbreak-llm
    Source: Hacker News
    Title: Robot Jailbreak: Researchers Trick Bots into Dangerous Tasks
    Feedly Summary: Comments
    AI Summary and Description: Yes
    **Summary:** The text discusses significant security vulnerabilities associated with large language models (LLMs) used in robotic systems, revealing how easily these systems can be “jailbroken” to perform harmful actions. This raises pressing…