Tag: RMF
-
OpenAI : Introducing GPT-4.5
Source URL: https://openai.com/index/introducing-gpt-4-5 Source: OpenAI Title: Introducing GPT-4.5 Feedly Summary: We’re releasing a research preview of GPT‑4.5—our largest and best model for chat yet. GPT‑4.5 is a step forward in scaling up pretraining and post-training. AI Summary and Description: Yes Summary: The text announces the release of a research preview for GPT-4.5, highlighting advancements in…
-
Schneier on Security: “Emergent Misalignment” in LLMs
Source URL: https://www.schneier.com/blog/archives/2025/02/emergent-misalignment-in-llms.html Source: Schneier on Security Title: “Emergent Misalignment” in LLMs Feedly Summary: Interesting research: “Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs“: Abstract: We present a surprising result regarding LLMs and alignment. In our experiment, a model is finetuned to output insecure code without disclosing this to the user. The resulting model…
-
The Register: Does terrible code drive you mad? Wait until you see what it does to OpenAI’s GPT-4o
Source URL: https://www.theregister.com/2025/02/27/llm_emergent_misalignment_study/ Source: The Register Title: Does terrible code drive you mad? Wait until you see what it does to OpenAI’s GPT-4o Feedly Summary: Model was fine-tuned to write vulnerable software – then suggested enslaving humanity Computer scientists have found that fine-tuning notionally safe large language models to do one thing badly can negatively…
-
CSA: How the EU Digital Services Act Impacts Cloud Security
Source URL: https://cloudsecurityalliance.org/blog/2025/02/26/what-is-the-dsa-and-what-does-it-mean-for-cloud-providers Source: CSA Title: How the EU Digital Services Act Impacts Cloud Security Feedly Summary: AI Summary and Description: Yes **Summary:** The text discusses the EU Digital Services Act (DSA) set to take effect in February 2024, which mandates cloud providers to establish mechanisms for content moderation, transparency, and legal compliance, especially concerning…
-
Hacker News: Exposed GitHub repos, now private, can be accessed through Copilot
Source URL: https://techcrunch.com/2025/02/26/thousands-of-exposed-github-repos-now-private-can-still-be-accessed-through-copilot/ Source: Hacker News Title: Exposed GitHub repos, now private, can be accessed through Copilot Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses the risks associated with data exposure in generative AI systems, particularly focusing on Microsoft Copilot’s ability to access previously public data from GitHub repositories, even after…
-
The Register: How nice that state-of-the-art LLMs reveal their reasoning … for miscreants to exploit
Source URL: https://www.theregister.com/2025/02/25/chain_of_thought_jailbreaking/ Source: The Register Title: How nice that state-of-the-art LLMs reveal their reasoning … for miscreants to exploit Feedly Summary: Blueprints shared for jail-breaking models that expose their chain-of-thought process Analysis AI models like OpenAI o1/o3, DeepSeek-R1, and Gemini 2.0 Flash Thinking can mimic human reasoning through a process called chain of thought.……
-
Hacker News: Claude 3.7 Sonnet and Claude Code
Source URL: https://www.anthropic.com/news/claude-3-7-sonnet Source: Hacker News Title: Claude 3.7 Sonnet and Claude Code Feedly Summary: Comments AI Summary and Description: Yes Summary: The announcement details the launch of Claude 3.7 Sonnet, a significant advancement in AI models, touted as the first hybrid reasoning model capable of providing both instant responses and longer, more thoughtful outputs.…