Simon Willison’s Weblog: LLM 0.27, the annotated release notes: GPT-5 and improved tool calling

Source URL: https://simonwillison.net/2025/Aug/11/llm-027/
Source: Simon Willison’s Weblog
Title: LLM 0.27, the annotated release notes: GPT-5 and improved tool calling

Feedly Summary: I shipped LLM 0.27 today, adding support for the new GPT-5 family of models from OpenAI plus a flurry of improvements to the tool calling features introduced in LLM 0.26. Here are the annotated release notes.
GPT-5

New models: gpt-5, gpt-5-mini and gpt-5-nano. #1229

I would have liked to get these out sooner, but LLM had accumulated quite a lot of other changes since the last release and I wanted to use GPT-5 as an excuse to wrap all of those up and get them out there.
These models work much the same as other OpenAI models, but they support a new minimal value for the reasoning_effort option. You can try that out like this:
llm -m gpt-5 'A letter advocating for cozy boxes for pelicans in Half Moon Bay harbor' -o reasoning_effort minimal

Setting "minimal" almost completely eliminates the "thinking" time for the model, causing it to behave more like GPT-4o.
Here's the letter it wrote me at a cost of 20 input tokens and 706 output tokens, which works out to $0.007085 (0.7085 cents).
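The same option is available from the Python API, where model options become keyword arguments to prompt(). A minimal sketch (this assumes you have already configured an OpenAI key with llm keys set openai):

import llm

model = llm.get_model("gpt-5")
# Options become keyword arguments; this mirrors -o reasoning_effort minimal on the CLI
response = model.prompt(
    "A letter advocating for cozy boxes for pelicans in Half Moon Bay harbor",
    reasoning_effort="minimal",
)
print(response.text())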
You can set the default model to GPT-5-mini (since it’s a bit cheaper) like this:
llm models default gpt-5-mini

Tools in templates

LLM templates can now include a list of tools. These can be named tools from plugins or arbitrary Python function blocks, see Tools in templates. #1009

I think this is the most important feature in the new release.
I added LLM’s tool calling features in LLM 0.26. You can call them from the Python API but you can also call them from the command-line like this:
llm -T llm_version -T llm_time 'Tell the time, then show the version'

Here’s the output of llm logs -c after running that command.
This example shows that you have to explicitly list all of the tools you would like to expose to the model, using the -T/--tool option one or more times.
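The Python API equivalent is model.chain(), which keeps going until the model has finished making tool calls. Here's a rough sketch using a trivial inline tool function:

import datetime
import llm

def current_time() -> str:
    "Return the current time as an ISO 8601 string."
    return datetime.datetime.now().isoformat()

model = llm.get_model("gpt-5")
# chain() loops: prompt -> tool calls -> tool results -> final response
chain_response = model.chain("What time is it right now?", tools=[current_time])
print(chain_response.text())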
In LLM 0.27 you can now save these tool collections to a template. Let’s try that now:
llm -T llm_version -T llm_time -m gpt-5 --save mytools

Now mytools is a template that bundles those two tools and sets the default model to GPT-5. We can run it like this:
llm -t mytools 'Time then version'

Let's do something more fun. My blog has a Datasette mirror which I can run queries against. I'm going to use the llm-tools-datasette plugin to turn that into a tool-driven template. This plugin uses a "toolbox", which looks a bit like a class. Toolboxes are described in the LLM documentation.
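A toolbox is essentially a class: the constructor arguments come from that Datasette("...") syntax in the --tool option, and its methods are exposed to the model as individual tools. Here's a deliberately simplified sketch of the shape (this is not the actual plugin code):

import llm
import httpx  # illustrative choice of HTTP client

class MiniDatasette(llm.Toolbox):
    "Simplified sketch of a Datasette-style toolbox, not the real plugin."

    def __init__(self, url):
        self.url = url

    def query(self, sql: str) -> str:
        "Run a read-only SQL query against the Datasette instance and return JSON."
        response = httpx.get(
            self.url + ".json", params={"sql": sql, "_shape": "array"}
        )
        response.raise_for_status()
        return response.text

Install the real plugin and save a template that uses it like this: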
llm install llm-tools-datasette

# Now create that template
llm --tool 'Datasette("https://datasette.simonwillison.net/simonwillisonblog")' \
-m gpt-5 -s 'Use Datasette tools to answer questions' --save blog

Now I can ask questions of my database like this:
llm -t blog 'top ten tags by number of entries' --td

The --td option there stands for --tools-debug: it means we can see all tool calls as they are run.
Here’s the output of the above:
Top 10 tags by number of entries (excluding drafts):
– quora — 1003
– projects — 265
– datasette — 238
– python — 213
– ai — 200
– llms — 200
– generative-ai — 197
– weeknotes — 193
– web-development — 166
– startups — 157

Full transcript with tool traces here.
I'm really excited about the ability to store configured tools in a template like this.

Tools can now return attachments, for models that support features such as image input. #1014

I want to build a tool that can render SVG to an image, then return that image so the model can see what it has drawn. For reasons.
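Something along these lines. Treat it as a sketch rather than working code: cairosvg is just one way to rasterize SVG, and the ToolOutput/Attachment return shown here is an approximation of the new mechanism, so check the tool attachments docs for the exact shape.

import llm
import cairosvg  # illustrative: one way to rasterize SVG

def render_svg(svg: str):
    "Render SVG markup to a PNG so the model can look at what it drew."
    png_bytes = cairosvg.svg2png(bytestring=svg.encode("utf-8"))
    # Assumption: wrapping the result in ToolOutput with an Attachment is how
    # a tool hands an image back; the docs have the exact return convention
    return llm.ToolOutput(
        output="Rendered the SVG to a PNG",
        attachments=[llm.Attachment(content=png_bytes, type="image/png")],
    )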

New methods on the Toolbox class: .add_tool(), .prepare() and .prepare_async(), described in Dynamic toolboxes. #1111

I added these because there's a lot of interest in an MCP plugin for LLM. Part of the challenge with MCP is that the user provides the URL to a server, but we then need to introspect that server and dynamically add the tools we discover there. The new .add_tool() method can do that, and the .prepare() and .prepare_async() methods give us a reliable way to run discovery code outside of the class constructor, allowing it to make asynchronous calls if necessary.
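To make that concrete, here's a rough sketch of the pattern. The discovery endpoint, the payload format and the exact add_tool() signature are illustrative guesses; the dynamic toolboxes documentation has the real details.

import llm
import httpx  # illustrative HTTP client

class RemoteToolbox(llm.Toolbox):
    "Sketch of a toolbox that discovers its tools from a server at runtime."

    def __init__(self, url):
        self.url = url

    def prepare(self):
        # prepare() runs before the toolbox is used, outside the constructor,
        # so discovery can happen here (prepare_async() is the awaitable twin)
        definitions = httpx.get(self.url + "/tools.json").json()  # hypothetical endpoint
        for definition in definitions:
            self.add_tool(self._build_tool(definition))

    def _build_tool(self, definition):
        def tool(**kwargs):
            # Forward the call to the remote server; purely illustrative
            return httpx.post(
                self.url + "/call/" + definition["name"], json=kwargs
            ).text

        tool.__name__ = definition["name"]
        tool.__doc__ = definition.get("description", "")
        return tool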

New model.conversation(before_call=x, after_call=y) parameters for registering callback functions to run before and after tool calls. See tool debugging hooks for details. #1088

Raising llm.CancelToolCall now only cancels the current tool call, passing an error back to the model and allowing it to continue. #1148

These hooks are useful for implementing more complex tool calling at the Python API layer. In addition to debugging and logging they allow Python code to intercept tool calls and cancel or delay them based on what they are trying to do.
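Here's roughly what that looks like in practice. The callback signatures shown here are approximate, so check the tool debugging hooks documentation for the exact details:

import datetime
import llm

def clock() -> str:
    "Return the current time."
    return datetime.datetime.now().isoformat()

def before_call(tool, tool_call):
    # Raising CancelToolCall skips just this call; the error is passed back
    # to the model, which can carry on without it
    if tool.name != "clock":
        raise llm.CancelToolCall("Only the clock tool is allowed")

def after_call(tool, tool_call, tool_result):
    print(f"{tool.name} returned {tool_result.output!r}")

model = llm.get_model("gpt-5")
conversation = model.conversation(
    tools=[clock], before_call=before_call, after_call=after_call
)
print(conversation.chain("What time is it?").text())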

Some model providers can serve different models from the same configured URL – llm-llama-server for example. Plugins for these providers can now record the resolved model ID of the model that was used to the LLM logs using the response.set_resolved_model(model_id) method. #1117

This solves a frustration I've had for a while where some of my plugins log the same model ID for requests that were processed by a bunch of different models under the hood, making my logs less valuable. The new mechanism allows plugins to record a more accurate model ID for a prompt, should it differ from the model ID that was requested.
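In a model plugin that proxies to a server like that, the call lives inside execute(). A sketch, with the server interaction faked as a placeholder:

import llm

class LlamaServerModel(llm.Model):
    "Illustrative sketch of a plugin model that proxies to a local server."

    model_id = "llama-server"

    def execute(self, prompt, stream, response, conversation):
        # Imagine the server's reply tells us which model actually handled it
        server_reply = {"model": "llama-3.3-70b-instruct", "text": "..."}  # placeholder
        # Record the real model ID in the logs, since it may differ from model_id
        response.set_resolved_model(server_reply["model"])
        yield server_reply["text"]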

New -l/--latest option for llm logs -q searchterm for searching logs ordered by date (most recent first) instead of the default relevance search. #1177

My personal log database has grown to over 8,000 entries now, and running full-text search queries against it often returned results from last year that were no longer relevant to me. Being able to find the latest prompt matching "pelican svg" is much more useful.
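For example, this shows the most recent matching prompts first:
llm logs -q 'pelican svg' --latest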
Everything else was bug fixes and documentation improvements:

Bug fixes and documentation

The register_embedding_models hook is now documented. #1049

Show visible stack trace for llm templates show invalid-template-name. #1053

Handle invalid tool names more gracefully in llm chat. #1104

Add a Tool plugins section to the plugin directory. #1110

Error on register(Klass) if the passed class is not a subclass of Toolbox. #1114

Add -h for --help for all llm CLI commands. #1134

Add missing dataclasses to advanced model plugins docs. #1137

Fixed a bug where llm logs -T llm_version "version" --async incorrectly recorded just a single log entry when it should have recorded two. #1150

All extra OpenAI model keys in extra-openai-models.yaml are now documented. #1228

Tags: projects, ai, datasette, annotated-release-notes, generative-ai, llms, llm, llm-tool-use, gpt-5
