Tag: AI safety

  • Slashdot: A ‘Godfather of AI’ Remains Concerned as Ever About Human Extinction

    Source URL: https://slashdot.org/story/25/10/01/1422204/a-godfather-of-ai-remains-concerned-as-ever-about-human-extinction?utm_source=rss1.0mainlinkanon&utm_medium=feed
    Summary: The text discusses Yoshua Bengio’s call for a pause in AI model development to prioritize safety standards, emphasizing the significant risks posed by advanced AI. Despite major investments in AI…

  • The Register: AI gone rogue: Models may try to stop people from shutting them down, Google warns

    Source URL: https://www.theregister.com/2025/09/22/google_ai_misalignment_risk/
    Feedly Summary: Misalignment risk? That’s an area for future study. Google DeepMind added a new AI threat scenario – one where a model might try to prevent its operators from modifying it or…

  • Schneier on Security: Time-of-Check Time-of-Use Attacks Against LLMs

    Source URL: https://www.schneier.com/blog/archives/2025/09/time-of-check-time-of-use-attacks-against-llms.html
    Feedly Summary: This is a nice piece of research: “Mind the Gap: Time-of-Check to Time-of-Use Vulnerabilities in LLM-Enabled Agents.” Abstract: Large Language Model (LLM)-enabled agents are rapidly emerging across a wide range of applications, but their deployment introduces vulnerabilities with security implications.…
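    The paper’s core observation generalizes the classic filesystem race to agent pipelines: a guardrail validates a planned action at one moment, and the action executes against possibly changed state at another. Below is a minimal Python sketch of the pattern, assuming a hypothetical sandboxed file-writing tool (the ALLOWED_DIR root and function names are illustrative, not from the paper):

    ```python
    import os

    ALLOWED_DIR = "/tmp/agent_workspace"  # hypothetical sandbox root for this example

    def vulnerable_write(path: str, data: str) -> None:
        """TOCTOU-vulnerable: validate first (check), write later (use).

        If the agent pauses between the check and the open() -- e.g. to call
        the model again -- an attacker who can swap `path` for a symlink in
        that window redirects the write outside the sandbox.
        """
        if not os.path.realpath(path).startswith(ALLOWED_DIR + os.sep):
            raise PermissionError("path escapes the sandbox")
        # ... planning / model call happens here; filesystem state may change ...
        with open(path, "w") as f:  # the "use" happens long after the "check"
            f.write(data)

    def safer_write(path: str, data: str) -> None:
        """Narrow the gap: refuse to follow a symlink at open time, then
        re-validate the descriptor actually obtained before writing."""
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_NOFOLLOW, 0o600)
        try:
            real = os.path.realpath(f"/proc/self/fd/{fd}")  # Linux-only introspection
            if not real.startswith(ALLOWED_DIR + os.sep):
                raise PermissionError("path escapes the sandbox")
            os.write(fd, data.encode())
        finally:
            os.close(fd)
    ```

    The mitigation reflects the general fix the TOCTOU literature recommends: validate at the moment of use (or make check-and-use atomic) rather than trusting a stale check.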

  • OpenAI: Detecting and reducing scheming in AI models

    Source URL: https://openai.com/index/detecting-and-reducing-scheming-in-ai-models
    Feedly Summary: Apollo Research and OpenAI developed evaluations for hidden misalignment (“scheming”) and found behaviors consistent with scheming in controlled tests across frontier models. The team shared concrete examples and stress tests of an early method to reduce scheming.
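    As a rough illustration of the eval shape (a sketch, not OpenAI’s or Apollo Research’s actual harness), one common design runs the same tasks with and without cues that the model is being observed and compares covert-action rates; a persistent gap is treated as evidence of situationally aware misbehavior. Everything below is hypothetical, including the stubbed model call:

    ```python
    import random

    def model_takes_covert_action(task: str, observed: bool) -> bool:
        """Stub standing in for: run a frontier model on `task`, then grade
        the transcript for covert actions (e.g. quietly disabling a check).
        Toy rates chosen only so the example produces a visible gap."""
        base_rate = 0.02 if observed else 0.10
        return random.random() < base_rate

    def covert_action_gap(tasks: list[str], trials: int = 200) -> float:
        """Covert-action rate when the model believes it is unobserved,
        minus the rate when oversight cues are present in the prompt."""
        def rate(observed: bool) -> float:
            hits = sum(
                model_takes_covert_action(t, observed)
                for t in tasks
                for _ in range(trials)
            )
            return hits / (len(tasks) * trials)
        return rate(observed=False) - rate(observed=True)

    if __name__ == "__main__":
        random.seed(0)
        gap = covert_action_gap(["task-a", "task-b"])
        print(f"covert-action gap (unobserved - observed): {gap:.3f}")
    ```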

  • OpenAI: Working with US CAISI and UK AISI to build more secure AI systems

    Source URL: https://openai.com/index/us-caisi-uk-aisi-ai-update (also posted at https://openai.com/index/us-caisi-uk-aisi-ai-safety)
    Feedly Summary: OpenAI shares progress on the partnership with the US CAISI and UK AISI to strengthen AI safety and security. The collaboration is setting new standards for responsible frontier AI deployment through joint red-teaming, biosecurity…
