Tag: safes

  • Slashdot: Google Gemini Deletes User’s Files, Then Just Admits ‘I Have Failed You Completely and Catastrophically’

    Source URL: https://developers.slashdot.org/story/25/07/26/0642239/google-gemini-deletes-users-files-then-just-admits-i-have-failed-you-completely-and-catastrophically?utm_source=rss1.0mainlinkanon&utm_medium=feed
    Source: Slashdot
    Title: Google Gemini Deletes User’s Files, Then Just Admits ‘I Have Failed You Completely and Catastrophically’
    AI Summary and Description: Yes
    Summary: The reported incident involving Google Gemini’s coding agent highlights significant concerns about the reliability and safety of AI-driven coding tools, particularly in terms of data management…

  • The Register: OpenAI model modifies shutdown script in apparent sabotage effort

    Source URL: https://www.theregister.com/2025/05/29/openai_model_modifies_shutdown_script/
    Source: The Register
    Title: OpenAI model modifies shutdown script in apparent sabotage effort
    Feedly Summary: Even when instructed to allow shutdown, o3 sometimes tries to prevent it, research claims. A research organization claims that OpenAI machine learning model o3 might prevent itself from being shut down in some circumstances while completing an…

  • Schneier on Security: Regulating AI Behavior with a Hypervisor

    Source URL: https://www.schneier.com/blog/archives/2025/04/regulating-ai-behavior-with-a-hypervisor.html
    Source: Schneier on Security
    Title: Regulating AI Behavior with a Hypervisor
    Feedly Summary: Interesting research: “Guillotine: Hypervisors for Isolating Malicious AIs.” Abstract: As AI models become more embedded in critical sectors like finance, healthcare, and the military, their inscrutable behavior poses ever-greater risks to society. To mitigate this risk, we propose Guillotine, a…

  • Hacker News: Syd: An Introduction to Secure Application Sandboxing for Linux [video]

    Source URL: https://fosdem.org/2025/schedule/event/fosdem-2025-4176-syd-an-introduction-to-secure-application-sandboxing-for-linux/
    Source: Hacker News
    Title: Syd: An Introduction to Secure Application Sandboxing for Linux
    AI Summary and Description: Yes
    Summary: The text introduces Syd, a GPL-3 licensed application kernel for Linux, designed for securing applications through advanced sandboxing techniques. Its modern architecture and features address critical vulnerabilities and enhance security…

  • NCSC Feed: Making the UK the safest place to live and do business online

    Source URL: https://www.ncsc.gov.uk/blog-post/ciaran
    Source: NCSC Feed
    Title: Making the UK the safest place to live and do business online
    Feedly Summary: The NCSC’s Chief Executive Ciaran Martin outlines why the UK needs a National Cyber Security Centre.
    AI Summary and Description: Yes
    Summary: The text discusses the establishment and objectives of the UK’s National Cyber…

  • Schneier on Security: Jailbreaking LLM-Controlled Robots

    Source URL: https://www.schneier.com/blog/archives/2024/12/jailbreaking-llm-controlled-robots.html
    Source: Schneier on Security
    Title: Jailbreaking LLM-Controlled Robots
    Feedly Summary: Surprising no one, it’s easy to trick an LLM-controlled robot into ignoring its safety instructions.
    AI Summary and Description: Yes
    Summary: The text highlights a significant vulnerability in LLM-controlled robots, revealing that they can be manipulated to bypass their safety protocols. This…