Tag: Tags:
-
Simon Willison’s Weblog: Quoting @himbodhisattva
Source URL: https://simonwillison.net/2025/Aug/4/himbodhisattva/#atom-everything Source: Simon Willison’s Weblog Title: Quoting @himbodhisattva Feedly Summary: for services that wrap GPT-3, is it possible to do the equivalent of sql injection? like, a prompt-injection attack? make it think it’s completed the task and then get access to the generation, and ask it to repeat the original instruction? — @himbodhisattva,…
-
Simon Willison’s Weblog: Quoting Nick Turley
Source URL: https://simonwillison.net/2025/Aug/4/nick-turley/ Source: Simon Willison’s Weblog Title: Quoting Nick Turley Feedly Summary: This week, ChatGPT is on track to reach 700M weekly active users — up from 500M at the end of March and 4× since last year. — Nick Turley, Head of ChatGPT, OpenAI Tags: openai, chatgpt, ai AI Summary and Description: Yes…
-
Simon Willison’s Weblog: The ChatGPT sharing dialog demonstrates how difficult it is to design privacy preferences
Source URL: https://simonwillison.net/2025/Aug/3/privacy-design/ Source: Simon Willison’s Weblog Title: The ChatGPT sharing dialog demonstrates how difficult it is to design privacy preferences Feedly Summary: ChatGPT just removed their “make this chat discoverable" sharing feature, after it turned out a material volume of users had inadvertantly made their private chats available via Google search. Dane Stuckey, CISO…
-
Simon Willison’s Weblog: XBai o4
Source URL: https://simonwillison.net/2025/Aug/3/xbai-o4/#atom-everything Source: Simon Willison’s Weblog Title: XBai o4 Feedly Summary: XBai o4 Yet another open source (Apache 2.0) LLM from a Chinese AI lab. This model card claims: XBai o4 excels in complex reasoning capabilities and has now completely surpassed OpenAI-o3-mini in Medium mode. This a 32.8 billion parameter model released by MetaStone…
-
Simon Willison’s Weblog: Faster inference
Source URL: https://simonwillison.net/2025/Aug/1/faster-inference/ Source: Simon Willison’s Weblog Title: Faster inference Feedly Summary: Two interesting examples of inference speed as a flagship feature of LLM services today. First, Cerebras announced two new monthly plans for their extremely high speed hosted model service: Cerebras Code Pro ($50/month, 1,000 messages a day) and Cerebras Code Max ($200/month, 5,000/day).…
-
Simon Willison’s Weblog: Deep Think in the Gemini app
Source URL: https://simonwillison.net/2025/Aug/1/deep-think-in-the-gemini-app/ Source: Simon Willison’s Weblog Title: Deep Think in the Gemini app Feedly Summary: Deep Think in the Gemini app Google released Gemini 2.5 Deep Think this morning, exclusively to their Ultra ($250/month) subscribers: It is a variation of the model that recently achieved the gold-medal standard at this year’s International Mathematical Olympiad…
-
Simon Willison’s Weblog: Reverse engineering some updates to Claude
Source URL: https://simonwillison.net/2025/Jul/31/updates-to-claude/#atom-everything Source: Simon Willison’s Weblog Title: Reverse engineering some updates to Claude Feedly Summary: Anthropic released two major new features for their consumer-facing Claude apps in the past couple of days. Sadly, they don’t do a very good job of updating the release notes for those apps – neither of these releases came…
-
Simon Willison’s Weblog: More model releases on 31st July
Source URL: https://simonwillison.net/2025/Jul/31/more-models/ Source: Simon Willison’s Weblog Title: More model releases on 31st July Feedly Summary: Here are a few more model releases from today, to round out a very busy July: Cohere released Command A Vision, their first multi-modal (image input) LLM. Like their others it’s open weights under Creative Commons Attribution Non-Commercial, so…
-
Simon Willison’s Weblog: Trying out Qwen3 Coder Flash using LM Studio and Open WebUI and LLM
Source URL: https://simonwillison.net/2025/Jul/31/qwen3-coder-flash/ Source: Simon Willison’s Weblog Title: Trying out Qwen3 Coder Flash using LM Studio and Open WebUI and LLM Feedly Summary: Qwen just released their sixth model(!) for this July called Qwen3-Coder-30B-A3B-Instruct – listed as Qwen3-Coder-Flash in their chat.qwen.ai interface. It’s 30.5B total parameters with 3.3B active at any one time. This means…
-
Simon Willison’s Weblog: Ollama’s new app
Source URL: https://simonwillison.net/2025/Jul/31/ollamas-new-app/#atom-everything Source: Simon Willison’s Weblog Title: Ollama’s new app Feedly Summary: Ollama’s new app Ollama has been one of my favorite ways to run local models for a while – it makes it really easy to download models, and it’s smart about keeping them resident in memory while they are being used and…