Tag: bypass

  • NCSC Feed: From bugs to bypasses: adapting vulnerability disclosure for AI safeguards

    Source URL: https://www.ncsc.gov.uk/blog-post/from-bugs-to-bypasses-adapting-vulnerability-disclosure-for-ai-safeguards Source: NCSC Feed Title: From bugs to bypasses: adapting vulnerability disclosure for AI safeguards Feedly Summary: Exploring how far cyber security approaches can help mitigate risks in generative AI systems AI Summary and Description: Yes Summary: The text addresses the intersection of cybersecurity strategies and generative AI systems, highlighting how established cybersecurity…

  • Embrace The Red: Wrap Up: The Month of AI Bugs

    Source URL: https://embracethered.com/blog/posts/2025/wrapping-up-month-of-ai-bugs/ Source: Embrace The Red Title: Wrap Up: The Month of AI Bugs Feedly Summary: That’s it. The Month of AI Bugs is done. There won’t be a post tomorrow, because I will be at PAX West. Overview of Posts ChatGPT: Exfiltrating Your Chat History and Memories With Prompt Injection | Video ChatGPT…

  • Slashdot: One Long Sentence is All It Takes To Make LLMs Misbehave

    Source URL: https://slashdot.org/story/25/08/27/1756253/one-long-sentence-is-all-it-takes-to-make-llms-misbehave?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: One Long Sentence is All It Takes To Make LLMs Misbehave Feedly Summary: AI Summary and Description: Yes Summary: The text discusses a significant security research finding from Palo Alto Networks’ Unit 42 regarding vulnerabilities in large language models (LLMs). The researchers explored methods that allow users to bypass…

  • OpenAI : OpenAI and Anthropic share findings from a joint safety evaluation

    Source URL: https://openai.com/index/openai-anthropic-safety-evaluation Source: OpenAI Title: OpenAI and Anthropic share findings from a joint safety evaluation Feedly Summary: OpenAI and Anthropic share findings from a first-of-its-kind joint safety evaluation, testing each other’s models for misalignment, instruction following, hallucinations, jailbreaking, and more—highlighting progress, challenges, and the value of cross-lab collaboration. AI Summary and Description: Yes Summary:…

  • The Register: One long sentence is all it takes to make LLMs misbehave

    Source URL: https://www.theregister.com/2025/08/26/breaking_llms_for_fun/ Source: The Register Title: One long sentence is all it takes to make LLMs misbehave Feedly Summary: Chatbots ignore their guardrails when your grammar sucks, researchers find Security researchers from Palo Alto Networks’ Unit 42 have discovered the key to getting large language model (LLM) chatbots to ignore their guardrails, and it’s…

  • The Register: Malware-ridden apps made it into Google’s Play Store, scored 19 million downloads

    Source URL: https://www.theregister.com/2025/08/26/apps_android_malware/ Source: The Register Title: Malware-ridden apps made it into Google’s Play Store, scored 19 million downloads Feedly Summary: Everything’s fine, the ad slinger assures us Cloud security vendor Zscaler says customers of Google’s Play Store have downloaded more than 19 million instances of malware-laden apps that evaded the web giant’s security scans.……

  • Slashdot: Japanese Media Groups Sue AI Search Engine Perplexity Over Alleged Copyright Infringement

    Source URL: https://slashdot.org/story/25/08/26/0553200/japanese-media-groups-sue-ai-search-engine-perplexity-over-alleged-copyright-infringement?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: Japanese Media Groups Sue AI Search Engine Perplexity Over Alleged Copyright Infringement Feedly Summary: AI Summary and Description: Yes Summary: Two major Japanese media groups are suing the AI search engine Perplexity for alleged copyright infringement, reflecting a growing trend among news publishers globally to take legal action against…