-
Simon Willison’s Weblog: Magistral 1.2
Source URL: https://simonwillison.net/2025/Sep/19/magistral/ Source: Simon Willison’s Weblog Title: Magistral 1.2 Feedly Summary: Mistral quietly released two new models yesterday: Magistral Small 1.2 (Apache 2.0, 96.1 GB on Hugging Face) and Magistral Medium 1.2 (not open weights, same as Mistral’s other “medium” models). Despite being described as “minor updates” to the Magistral 1.1 models, these have…
-
Simon Willison’s Weblog: The Hidden Risk in Notion 3.0 AI Agents: Web Search Tool Abuse for Data Exfiltration
Source URL: https://simonwillison.net/2025/Sep/19/notion-lethal-trifecta/ Source: Simon Willison’s Weblog Title: The Hidden Risk in Notion 3.0 AI Agents: Web Search Tool Abuse for Data Exfiltration Feedly Summary: The Hidden Risk in Notion 3.0 AI Agents: Web Search Tool Abuse for Data Exfiltration Abi Raghuram reports that Notion 3.0, released yesterday, introduces new prompt injection data exfiltration vulnerabilities…
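The exfiltration pattern described here works by an injected prompt convincing the agent to call its web search tool with an attacker-controlled URL carrying private data in the query string. One common mitigation is to restrict where the tool may send requests. A minimal sketch (all names are hypothetical, not Notion's actual implementation): an allowlist check applied to every outbound URL before the tool fetches it.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: the only domains the agent's web tool may contact.
ALLOWED_HOSTS = {"en.wikipedia.org", "docs.python.org"}

def is_safe_fetch(url: str) -> bool:
    """Reject URLs outside the allowlist. Injected prompts typically try to
    exfiltrate data by encoding it in the query string of an attacker URL,
    so blocking unknown hosts closes that channel."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_HOSTS

# An attacker-style URL smuggling data out via its query string is blocked:
assert not is_safe_fetch("https://attacker.example/log?data=secret-client-list")
assert is_safe_fetch("https://en.wikipedia.org/wiki/Prompt_injection")
```

This only closes the outbound-request leg of the "lethal trifecta"; the injected instructions themselves are still a separate problem.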
-
Simon Willison’s Weblog: I think "agent" may finally have a widely enough agreed upon definition to be useful jargon now
Source URL: https://simonwillison.net/2025/Sep/18/agents/ Source: Simon Willison’s Weblog Title: I think "agent" may finally have a widely enough agreed upon definition to be useful jargon now Feedly Summary: I’ve noticed something interesting over the past few weeks: I’ve started using the term “agent” in conversations where I don’t feel the need to then define it, roll…
-
Simon Willison’s Weblog: Anthropic: A postmortem of three recent issues
Source URL: https://simonwillison.net/2025/Sep/17/anthropic-postmortem/ Source: Simon Willison’s Weblog Title: Anthropic: A postmortem of three recent issues Feedly Summary: Anthropic: A postmortem of three recent issues Anthropic had a very bad month in terms of model reliability: Between August and early September, three infrastructure bugs intermittently degraded Claude’s response quality. We’ve now resolved these issues and want…
-
Simon Willison’s Weblog: ICPC medals for OpenAI and Gemini
Source URL: https://simonwillison.net/2025/Sep/17/icpc/#atom-everything Source: Simon Willison’s Weblog Title: ICPC medals for OpenAI and Gemini Feedly Summary: In July it was the International Math Olympiad (OpenAI, Gemini), today it’s the International Collegiate Programming Contest (ICPC). Once again, both OpenAI and Gemini competed with models that achieved Gold medal performance. OpenAI’s Mostafa Rohaninejad: We received the problems…
-
Simon Willison’s Weblog: GPT‑5-Codex and upgrades to Codex
Source URL: https://simonwillison.net/2025/Sep/15/gpt-5-codex/#atom-everything Source: Simon Willison’s Weblog Title: GPT‑5-Codex and upgrades to Codex Feedly Summary: GPT‑5-Codex and upgrades to Codex OpenAI half-released a new model today: GPT‑5-Codex, a fine-tuned GPT-5 variant explicitly designed for their various AI-assisted programming tools. I say half-released because it’s not yet available via their API, but they “plan to make…
-
Simon Willison’s Weblog: Models can prompt now
Source URL: https://simonwillison.net/2025/Sep/14/models-can-prompt/#atom-everything Source: Simon Willison’s Weblog Title: Models can prompt now Feedly Summary: Here’s an interesting example of models incrementally improving over time: I am finding that today’s leading models are competent at writing prompts for themselves and each other. A year ago I was quite skeptical of the pattern where models are used…
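The pattern described here is often called meta-prompting: a first model call drafts a prompt, and a second call runs it. A rough sketch of the control flow, where `call_model` is a hypothetical stand-in for any chat-completion API (not a real library function):

```python
def call_model(prompt: str) -> str:
    # Placeholder: a real implementation would call an LLM API here.
    # This stub just echoes so the pipeline's shape is visible.
    return f"[model output for: {prompt[:40]}...]"

def meta_prompt(task: str) -> str:
    """Have the model write its own prompt, then execute that prompt."""
    # Step 1: ask the model to author a detailed prompt for the task.
    drafted = call_model(
        f"Write a clear, detailed prompt that instructs an LLM to: {task}"
    )
    # Step 2: run the model-authored prompt as a second call.
    return call_model(drafted)
```

In practice the two calls can go to different models, e.g. a stronger model drafting prompts for a cheaper one.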
-
Simon Willison’s Weblog: gpt-5 and gpt-5-mini rate limit updates
Source URL: https://simonwillison.net/2025/Sep/12/gpt-5-rate-limits/#atom-everything Source: Simon Willison’s Weblog Title: gpt-5 and gpt-5-mini rate limit updates Feedly Summary: gpt-5 and gpt-5-mini rate limit updates OpenAI have increased the rate limits for their two main GPT-5 models. These look significant: gpt-5 Tier 1: 30K → 500K TPM (1.5M batch) Tier 2: 450K → 1M (3M batch) Tier 3:…
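Even with the raised TPM caps, clients can still hit the limit in bursts; the standard client-side response is jittered exponential backoff on 429 errors. An illustrative sketch (not OpenAI's SDK, which has its own built-in retry handling):

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for an HTTP 429 response from the API."""

def with_backoff(fn, max_retries=5, base=0.5):
    """Retry fn on rate-limit errors, roughly doubling the wait each time."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            # Jittered exponential backoff: ~0.5s, 1s, 2s, ... plus noise,
            # so many clients don't all retry in lockstep.
            time.sleep(base * 2 ** attempt + random.uniform(0, 0.1))
    raise RuntimeError("still rate limited after retries")
```

The jitter matters: without it, a fleet of clients that were throttled together will retry together and get throttled again.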
-
Simon Willison’s Weblog: Comparing the memory implementations of Claude and ChatGPT
Source URL: https://simonwillison.net/2025/Sep/12/claude-memory/#atom-everything Source: Simon Willison’s Weblog Title: Comparing the memory implementations of Claude and ChatGPT Feedly Summary: Claude Memory: A Different Philosophy Shlok Khemani has been doing excellent work reverse-engineering LLM systems and documenting his discoveries. Last week he wrote about ChatGPT memory. This week it’s Claude. Claude’s memory system has two fundamental characteristics.…
-
Simon Willison’s Weblog: Qwen3-Next-80B-A3B: 🐧🦩 Who needs legs?!
Source URL: https://simonwillison.net/2025/Sep/12/qwen3-next/#atom-everything Source: Simon Willison’s Weblog Title: Qwen3-Next-80B-A3B: 🐧🦩 Who needs legs?! Feedly Summary: Qwen3-Next-80B-A3B Qwen announced two new models via their Twitter account (nothing on their blog yet): Qwen3-Next-80B-A3B-Instruct and Qwen3-Next-80B-A3B-Thinking. They make some big claims on performance: Qwen3-Next-80B-A3B-Instruct approaches our 235B flagship. Qwen3-Next-80B-A3B-Thinking outperforms Gemini-2.5-Flash-Thinking. The name “80B-A3B” indicates 80 billion parameters…
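Assuming the usual mixture-of-experts reading of that naming scheme, "80B-A3B" means 80 billion total parameters with roughly 3 billion active per token, which is what makes the inference cost closer to a small model's:

```python
# "80B-A3B" under the common MoE naming convention (an assumption, not
# spelled out in the summary above): 80B total params, ~3B active per token.
total_params = 80_000_000_000
active_params = 3_000_000_000

# Fraction of the network that fires on each forward pass.
active_fraction = active_params / total_params

# Rough compute advantage over a dense model of the same total size.
dense_equivalent_ratio = total_params / active_params
```

Only the active parameters contribute to per-token compute, though all 80B still have to fit in memory.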