Tag: AI safety

  • Slashdot: LLM Found Transmitting Behavioral Traits to ‘Student’ LLM Via Hidden Signals in Data

    Source URL: https://slashdot.org/story/25/08/17/0331217/llm-found-transmitting-behavioral-traits-to-student-llm-via-hidden-signals-in-data?utm_source=rss1.0mainlinkanon&utm_medium=feed
    Source: Slashdot
    Title: LLM Found Transmitting Behavioral Traits to ‘Student’ LLM Via Hidden Signals in Data
    Feedly Summary:
    AI Summary and Description: Yes
    Summary: The study highlights a concerning phenomenon in AI development known as subliminal learning, where a “teacher” model instills traits in a “student” model without explicit instruction. This can…

  • Slashdot: Co-Founder of xAI Departs the Company

    Source URL: https://slashdot.org/story/25/08/14/0414234/co-founder-of-xai-departs-the-company?utm_source=rss1.0mainlinkanon&utm_medium=feed
    Source: Slashdot
    Title: Co-Founder of xAI Departs the Company
    Feedly Summary:
    AI Summary and Description: Yes
    Summary: Igor Babuschkin, co-founder of xAI, is departing to launch Babuschkin Ventures, a VC firm aimed at supporting AI safety and startups that promote human advancement. His experience includes significant roles at both xAI and leading…

  • The Register: VS Code previews chat checkpoints for unpicking careless talk

    Source URL: https://www.theregister.com/2025/08/12/vs_code_previews_chat_checkpoints/
    Source: The Register
    Title: VS Code previews chat checkpoints for unpicking careless talk
    Feedly Summary: Microsoft’s AI-centric code editor and IDE adds the ability to roll back misguided AI prompts. The Microsoft Visual Studio Code (VS Code) team has rolled out version 1.103 with new features including GitHub Copilot chat checkpoints.… AI Summary…

  • OpenAI: From hard refusals to safe-completions: toward output-centric safety training

    Source URL: https://openai.com/index/gpt-5-safe-completions
    Source: OpenAI
    Title: From hard refusals to safe-completions: toward output-centric safety training
    Feedly Summary: Discover how OpenAI’s new safe-completions approach in GPT-5 improves both safety and helpfulness in AI responses—moving beyond hard refusals to nuanced, output-centric safety training for handling dual-use prompts.
    AI Summary and Description: Yes
    Summary: The text discusses OpenAI’s…

  • AWS News Blog: Minimize AI hallucinations and deliver up to 99% verification accuracy with Automated Reasoning checks: Now available

    Source URL: https://aws.amazon.com/blogs/aws/minimize-ai-hallucinations-and-deliver-up-to-99-verification-accuracy-with-automated-reasoning-checks-now-available/
    Source: AWS News Blog
    Title: Minimize AI hallucinations and deliver up to 99% verification accuracy with Automated Reasoning checks: Now available
    Feedly Summary: Build responsible AI applications with the first and only solution that delivers up to 99% verification accuracy, using sound mathematical logic and formal verification techniques to minimize AI hallucinations…

  • Slashdot: Google’s New Genie 3 AI Model Creates Video Game Worlds In Real Time

    Source URL: https://tech.slashdot.org/story/25/08/05/211240/googles-new-genie-3-ai-model-creates-video-game-worlds-in-real-time?utm_source=rss1.0mainlinkanon&utm_medium=feed
    Source: Slashdot
    Title: Google’s New Genie 3 AI Model Creates Video Game Worlds In Real Time
    Feedly Summary:
    AI Summary and Description: Yes
    Summary: Google DeepMind’s release of Genie 3 marks a significant advancement in AI capabilities, specifically in the realm of interactive 3D environment generation. The ability for users and AI…

  • The Register: Meta declines to abide by voluntary EU AI safety guidelines

    Source URL: https://www.theregister.com/2025/07/18/meta_declines_eu_ai_guidelines/
    Source: The Register
    Title: Meta declines to abide by voluntary EU AI safety guidelines
    Feedly Summary: GPAI code asks for transparency, copyright, and safety pledges. Two weeks before the EU AI Act takes effect, the European Commission issued voluntary guidelines for providers of general-purpose AI models. However, Meta refused to sign, arguing…

  • OpenAI: Agent bio bug bounty call

    Source URL: https://openai.com/bio-bug-bounty
    Source: OpenAI
    Title: Agent bio bug bounty call
    Feedly Summary: OpenAI invites researchers to its Bio Bug Bounty. Test the ChatGPT agent’s safety with a universal jailbreak prompt and win up to $25,000.
    AI Summary and Description: Yes
    Summary: The text highlights OpenAI’s Bio Bug Bounty initiative, which invites researchers to test…