Tag: AI systems

  • Simon Willison’s Weblog: Quoting Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs

    Source URL: https://simonwillison.net/2025/Feb/25/emergent-misalignment/ Source: Simon Willison’s Weblog Title: Quoting Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs Feedly Summary: In our experiment, a model is finetuned to output insecure code without disclosing this to the user. The resulting model acts misaligned on a broad range of prompts that are unrelated to coding: it asserts…

  • Hacker News: Narrow finetuning can produce broadly misaligned LLM [pdf]

    Source URL: https://martins1612.github.io/emergent_misalignment_betley.pdf Source: Hacker News Title: Narrow finetuning can produce broadly misaligned LLM [pdf] Feedly Summary: Comments AI Summary and Description: Yes Summary: The document presents findings on the phenomenon of “emergent misalignment” in large language models (LLMs) like GPT-4o when finetuned on specific narrow tasks, particularly the creation of insecure code. The results…

  • OpenAI: Deep research System Card

    Source URL: https://openai.com/index/deep-research-system-card Source: OpenAI Title: Deep research System Card Feedly Summary: This report outlines the safety work carried out prior to releasing deep research including external red teaming, frontier risk evaluations according to our Preparedness Framework, and an overview of the mitigations we built in to address key risk areas. AI Summary and Description:…

  • Simon Willison’s Weblog: Leaked Windsurf prompt

    Source URL: https://simonwillison.net/2025/Feb/25/leaked-windsurf-prompt/ Source: Simon Willison’s Weblog Title: Leaked Windsurf prompt Feedly Summary: Leaked Windsurf prompt The Windsurf Editor is Codeium’s highly regarded entrant into the fork-of-VS-code AI-enhanced IDE model first pioneered by Cursor (and by VS Code itself). I heard online that it had a quirky system prompt, and was able to replicate that…

  • Slashdot: Call of Duty Maker Activision Admits To Using AI

    Source URL: https://slashdot.org/story/25/02/25/1614220/call-of-duty-maker-activision-admits-to-using-ai?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: Call of Duty Maker Activision Admits To Using AI Feedly Summary: AI Summary and Description: Yes Summary: Activision has confirmed the use of AI-generated content in its games, specifically in the Call of Duty franchise, which aligns with growing trends in the gaming industry where generative AI plays a…

  • Hacker News: DeepSearcher: A Local open-source Deep Research

    Source URL: https://milvus.io/blog/introduce-deepsearcher-a-local-open-source-deep-research.md Source: Hacker News Title: DeepSearcher: A Local open-source Deep Research Feedly Summary: Comments AI Summary and Description: Yes Summary: The provided text outlines the development and functionality of DeepSearcher, an open-source research agent that automates query decomposition, data retrieval, and synthesis of information into detailed reports. It showcases innovations in AI-driven research…

  • Cisco Security Blog: AI Threat Intelligence Roundup: February 2025

    Source URL: https://blogs.cisco.com/security/ai-threat-intelligence-roundup-february-2025 Source: Cisco Security Blog Title: AI Threat Intelligence Roundup: February 2025 Feedly Summary: AI threat research is a fundamental part of Cisco’s approach to AI security. Our roundups highlight new findings from both original and third-party sources. AI Summary and Description: Yes Summary: The text emphasizes Cisco’s commitment to AI security through…

  • OpenAI: Estonia and OpenAI to bring ChatGPT to schools nationwide

    Source URL: https://openai.com/index/estonia-schools-and-chatgpt Source: OpenAI Title: Estonia and OpenAI to bring ChatGPT to schools nationwide Feedly Summary: Estonia and OpenAI to bring ChatGPT to schools nationwide. OpenAI will work with the Estonian Government to provide students and teachers in the secondary school system with access to ChatGPT Edu. AI Summary and Description: Yes Summary: The…

  • The Register: How nice that state-of-the-art LLMs reveal their reasoning … for miscreants to exploit

    Source URL: https://www.theregister.com/2025/02/25/chain_of_thought_jailbreaking/ Source: The Register Title: How nice that state-of-the-art LLMs reveal their reasoning … for miscreants to exploit Feedly Summary: Blueprints shared for jail-breaking models that expose their chain-of-thought process. Analysis: AI models like OpenAI o1/o3, DeepSeek-R1, and Gemini 2.0 Flash Thinking can mimic human reasoning through a process called chain of thought.…