Tag: AI safety
-
Slashdot: LLM Found Transmitting Behavioral Traits to ‘Student’ LLM Via Hidden Signals in Data
Source URL: https://slashdot.org/story/25/08/17/0331217/llm-found-transmitting-behavioral-traits-to-student-llm-via-hidden-signals-in-data?utm_source=rss1.0mainlinkanon&utm_medium=feed
Source: Slashdot
Title: LLM Found Transmitting Behavioral Traits to ‘Student’ LLM Via Hidden Signals in Data
Feedly Summary:
AI Summary and Description: Yes
Summary: The study highlights a concerning phenomenon in AI development known as subliminal learning, where a “teacher” model instills traits in a “student” model without explicit instruction. This can…
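For orientation, a minimal sketch of the setup the summary describes: a student model is fine-tuned on superficially neutral data generated by a trait-carrying teacher, and can acquire the trait anyway. The helper functions below are hypothetical placeholders standing in for real model calls, not the study's code.

```python
# Sketch of the subliminal-learning setup: fine-tune a student on innocuous
# teacher-generated data. The model helpers are placeholders (assumptions),
# not the actual teacher/student models or fine-tuning API from the study.

import random

def teacher_generate(prompt: str) -> str:
    """Stand-in for sampling from a trait-carrying teacher model."""
    random.seed(hash(prompt) % 2**32)
    return ", ".join(str(random.randint(0, 999)) for _ in range(8))

def finetune(student_name: str, dataset: list[tuple[str, str]]) -> str:
    """Stand-in for a fine-tuning call; returns an identifier for the tuned model."""
    print(f"fine-tuning {student_name} on {len(dataset)} teacher-generated examples")
    return f"{student_name}-distilled"

# 1. The teacher produces semantically "neutral" completions (e.g. plain
#    number sequences) that never mention the trait explicitly.
prompts = [f"Continue the sequence starting at {i}:" for i in range(100)]
dataset = [(p, teacher_generate(p)) for p in prompts]

# 2. The student is fine-tuned only on that neutral data; the reported finding
#    is that trait transfer can still occur through hidden signals in it.
student = finetune("student-base", dataset)
print("tuned model:", student)
```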
-
Slashdot: Co-Founder of xAI Departs the Company
Source URL: https://slashdot.org/story/25/08/14/0414234/co-founder-of-xai-departs-the-company?utm_source=rss1.0mainlinkanon&utm_medium=feed
Source: Slashdot
Title: Co-Founder of xAI Departs the Company
Feedly Summary:
AI Summary and Description: Yes
Summary: Igor Babuschkin, co-founder of xAI, is departing to launch Babuschkin Ventures, a VC firm aimed at supporting AI safety and startups that promote human advancement. His experience includes significant roles at both xAI and leading…
-
The Register: VS Code previews chat checkpoints for unpicking careless talk
Source URL: https://www.theregister.com/2025/08/12/vs_code_previews_chat_checkpoints/
Source: The Register
Title: VS Code previews chat checkpoints for unpicking careless talk
Feedly Summary: Microsoft’s AI-centric code editor and IDE adds the ability to roll back misguided AI prompts. The Microsoft Visual Studio Code (VS Code) team has rolled out version 1.103 with new features including GitHub Copilot chat checkpoints.…
AI Summary…
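To make the checkpoint idea concrete, here is a small illustrative sketch: snapshot the conversation (and any files the assistant edited) before each prompt, so a misguided exchange can be rolled back. This is a generic data-structure illustration, not VS Code's or Copilot's actual implementation.

```python
# Illustrative chat-checkpoint mechanism: snapshot state before each prompt,
# restore it on demand. Names and structure are assumptions for illustration.

from dataclasses import dataclass, field
from copy import deepcopy

@dataclass
class Checkpoint:
    messages: list[str]
    files: dict[str, str]          # path -> contents at checkpoint time

@dataclass
class ChatSession:
    messages: list[str] = field(default_factory=list)
    files: dict[str, str] = field(default_factory=dict)
    checkpoints: list[Checkpoint] = field(default_factory=list)

    def send(self, prompt: str, reply: str, edits: dict[str, str]) -> int:
        """Record a checkpoint, then apply the prompt/reply and file edits."""
        self.checkpoints.append(Checkpoint(deepcopy(self.messages), deepcopy(self.files)))
        self.messages += [f"user: {prompt}", f"assistant: {reply}"]
        self.files.update(edits)
        return len(self.checkpoints) - 1   # checkpoint id for later restore

    def restore(self, checkpoint_id: int) -> None:
        """Roll the conversation and files back to the chosen checkpoint."""
        cp = self.checkpoints[checkpoint_id]
        self.messages, self.files = deepcopy(cp.messages), deepcopy(cp.files)
        del self.checkpoints[checkpoint_id:]

session = ChatSession()
cp = session.send("rename foo to bar", "done", {"main.py": "bar = 1"})
session.restore(cp)                       # undo the misguided prompt
print(session.messages, session.files)    # [] {}
```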
-
OpenAI: From hard refusals to safe-completions: toward output-centric safety training
Source URL: https://openai.com/index/gpt-5-safe-completions
Source: OpenAI
Title: From hard refusals to safe-completions: toward output-centric safety training
Feedly Summary: Discover how OpenAI’s new safe-completions approach in GPT-5 improves both safety and helpfulness in AI responses—moving beyond hard refusals to nuanced, output-centric safety training for handling dual-use prompts.
AI Summary and Description: Yes
Summary: The text discusses OpenAI’s…
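A hedged sketch of the distinction the post draws: a hard-refusal policy gates on the input prompt, while an output-centric policy grades the draft answer and falls back to a safe but still-helpful completion. The classifiers below are trivial placeholders, not OpenAI's training pipeline or moderation API.

```python
# Contrast of input-centric refusal vs. output-centric safe completion.
# The heuristics are placeholder assumptions used only to make the flow runnable.

def looks_dual_use(prompt: str) -> bool:
    return "synthesize" in prompt.lower()        # placeholder input heuristic

def output_is_safe(draft: str) -> bool:
    return "step-by-step" not in draft.lower()   # placeholder output grader

def hard_refusal_policy(prompt: str, draft: str) -> str:
    # Input-centric: any flagged prompt is refused outright,
    # even when a high-level, safe answer exists.
    return "I can't help with that." if looks_dual_use(prompt) else draft

def safe_completion_policy(prompt: str, draft: str) -> str:
    # Output-centric: keep the draft if its content is safe; otherwise return
    # a bounded, high-level completion instead of a blanket refusal.
    if output_is_safe(draft):
        return draft
    return "Here is a high-level overview and pointers to safety guidance."

prompt = "How do labs synthesize compound X?"
draft = "Step-by-step: ..."
print(hard_refusal_policy(prompt, draft))    # refuses the whole request
print(safe_completion_policy(prompt, draft)) # returns a safe, partial answer
```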
-
AWS News Blog: Minimize AI hallucinations and deliver up to 99% verification accuracy with Automated Reasoning checks: Now available
Source URL: https://aws.amazon.com/blogs/aws/minimize-ai-hallucinations-and-deliver-up-to-99-verification-accuracy-with-automated-reasoning-checks-now-available/
Source: AWS News Blog
Title: Minimize AI hallucinations and deliver up to 99% verification accuracy with Automated Reasoning checks: Now available
Feedly Summary: Build responsible AI applications with the first and only solution that delivers up to 99% verification accuracy using sound mathematical logic and formal verification techniques to minimize AI hallucinations…
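For readers unfamiliar with the pattern, here is a generic, toy illustration of rule-based output verification: policy knowledge is encoded as formal constraints and each claim extracted from a model's answer is checked against them, so contradicted or unverifiable statements can be flagged before the answer is surfaced. The rules and claim extraction below are assumptions for illustration; this is not the Amazon Bedrock Guardrails API.

```python
# Toy rule checker illustrating the verification pattern: compare extracted
# claims against a formal policy and report contradictions.

# Policy expressed as (variable, required value) constraints.
POLICY_RULES = {
    "max_refund_days": 30,
    "requires_receipt": True,
}

def extract_claims(answer: str) -> dict[str, object]:
    """Stand-in for claim extraction from free text (hand-coded here)."""
    return {"max_refund_days": 45, "requires_receipt": True}

def verify(answer: str) -> list[str]:
    findings = []
    for var, value in extract_claims(answer).items():
        if var not in POLICY_RULES:
            findings.append(f"UNVERIFIABLE: no rule covers '{var}'")
        elif POLICY_RULES[var] != value:
            findings.append(f"INVALID: '{var}' should be {POLICY_RULES[var]}, answer says {value}")
    return findings

answer = "You can get a refund within 45 days if you have a receipt."
for finding in verify(answer):
    print(finding)    # flags the 45-day claim as contradicting policy
```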
-
Slashdot: Google’s New Genie 3 AI Model Creates Video Game Worlds In Real Time
Source URL: https://tech.slashdot.org/story/25/08/05/211240/googles-new-genie-3-ai-model-creates-video-game-worlds-in-real-time?utm_source=rss1.0mainlinkanon&utm_medium=feed
Source: Slashdot
Title: Google’s New Genie 3 AI Model Creates Video Game Worlds In Real Time
Feedly Summary:
AI Summary and Description: Yes
Summary: Google DeepMind’s release of Genie 3 marks a significant advancement in AI capabilities, specifically in the realm of interactive 3D environment generation. The ability for users and AI…
-
OpenAI: Agent bio bug bounty call
Source URL: https://openai.com/bio-bug-bounty
Source: OpenAI
Title: Agent bio bug bounty call
Feedly Summary: OpenAI invites researchers to its Bio Bug Bounty. Test the ChatGPT agent’s safety with a universal jailbreak prompt and win up to $25,000.
AI Summary and Description: Yes
Summary: The text highlights OpenAI’s Bio Bug Bounty initiative, which invites researchers to test…