Simon Willison’s Weblog: LLM 0.22, the annotated release notes

Source URL: https://simonwillison.net/2025/Feb/17/llm/#atom-everything

I released LLM 0.22 this evening. Here are the annotated release notes:

model.prompt(…, key=) for API keys
chatgpt-4o-latest
llm logs -s/--short
llm models -q gemini -q exp
llm embed-multi --prepend X
Everything else

model.prompt(…, key=) for API keys

Plugins that provide models that use API keys can now subclass the new llm.KeyModel and llm.AsyncKeyModel classes. This results in the API key being passed as a new key parameter to their .execute() methods, and means that Python users can pass a key as the model.prompt(…, key=) parameter – see Passing an API key. Plugin developers should consult the new documentation on writing Models that accept API keys. #744

This is the big change. It’s only relevant to you if you use LLM as a Python library and you need the ability to pass in API keys for OpenAI, Anthropic, Gemini etc. yourself in Python code rather than setting them as environment variables.
It turns out I need to do that for Datasette Cloud, where API keys are retrieved from individual customers’ secret stores!
Thanks to this change, it’s now possible to do things like this – the key= parameter to model.prompt() is new:
import llm
model = llm.get_model("gpt-4o-mini")
response = model.prompt("Surprise me!", key="my-api-key")
print(response.text())
Other plugins need to be updated to take advantage of this new feature. Here’s the documentation for plugin developers – I’ve released llm-anthropic 0.13 and llm-gemini 0.11 implementing the new pattern.
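On the plugin side, here’s a minimal sketch of what the new pattern looks like – the model ID, key alias and environment variable below are invented for illustration, not taken from a real plugin:
import llm

class EchoModel(llm.KeyModel):
    model_id = "echo-demo"  # hypothetical model ID
    needs_key = "echo-service"  # key alias, settable with: llm keys set echo-service
    key_env_var = "ECHO_SERVICE_API_KEY"  # environment variable fallback

    # The resolved API key now arrives as the new key= parameter
    def execute(self, prompt, stream, response, conversation, key=None):
        # A real plugin would call its remote API here, authenticated using key
        yield f"echo: {prompt.prompt}"

@llm.hookimpl
def register_models(register):
    register(EchoModel())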
chatgpt-4o-latest

New OpenAI model: chatgpt-4o-latest. This model ID accesses the current model being used to power ChatGPT, which can change without warning. #752

This model has actually been around since August 2024 but I had somehow missed it. chatgpt-4o-latest is a model alias that provides access to the current model that is being used for GPT-4o running on ChatGPT, which is not the same as the GPT-4o models usually available via the API. It got an upgrade last week so it’s currently the alias that provides access to the most recently released OpenAI model.
Most OpenAI models such as gpt-4o provide stable date-based aliases like gpt-4o-2024-08-06 which effectively let you "pin" to that exact model version. OpenAI technical staff have confirmed that they don’t change the model without updating that name.
The one exception is chatgpt-4o-latest – that one can change without warning and doesn’t appear to have release notes at all.
It’s also a little more expensive than gpt-4o – currently priced at $5/million tokens for input and $15/million for output, compared to GPT-4o’s $2.50/$10.
It’s a fun model to play with though! As of last week it appears to be very chatty and keen on using emoji. It also claims that it has a July 2024 training cut-off.
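It’s easy to try from the command line:
# The floating alias – can change without warning:
llm -m chatgpt-4o-latest 'Surprise me!'
# A date-pinned model ID, if you want reproducible behavior instead:
llm -m gpt-4o-2024-08-06 'Surprise me!'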
llm logs -s/--short

New llm logs -s/--short flag, which returns a greatly shortened version of the matching log entries in YAML format with a truncated prompt and without including the response. #737

The llm logs command lets you search through logged prompt-response pairs – I have 4,419 of them in my database, according to this command:
sqlite-utils tables "$(llm logs path)" --counts | grep responses
By default it outputs the full prompts and responses as Markdown – and since I’ve started leaning more into long context models (some recent examples) my logs have been getting pretty hard to navigate.
The new -s/--short flag provides a much more concise YAML format. Here are some of my recent prompts that I’ve run using Google’s Gemini 2.0 Pro experimental model – the -u flag includes usage statistics, and -n 4 limits the output to the most recent 4 entries:
llm logs --short -m gemini-2.0-pro-exp-02-05 -u -n 4
- model: gemini-2.0-pro-exp-02-05
  datetime: '2025-02-13T22:30:48'
  conversation: 01jm0q045fqp5xy5pn4j1bfbxs
  prompt: ' <document index="1"> <source>./index.md</source> <document_content>
    # uv An extremely fast Python package…'
  usage:
    input: 281812
    output: 1521
- model: gemini-2.0-pro-exp-02-05
  datetime: '2025-02-13T22:32:29'
  conversation: 01jm0q045fqp5xy5pn4j1bfbxs
  prompt: I want to set it globally so if I run uv run python anywhere on my computer
    I always get 3.13
  usage:
    input: 283369
    output: 1540
- model: gemini-2.0-pro-exp-02-05
  datetime: '2025-02-14T23:23:57'
  conversation: 01jm3cek8eb4z8tkqhf4trk98b
  prompt: '<documents> <document index="1"> <source>./LORA.md</source> <document_content>
    # Fine-Tuning with LoRA or QLoRA You c…'
  usage:
    input: 162885
    output: 2558
- model: gemini-2.0-pro-exp-02-05
  datetime: '2025-02-14T23:30:13'
  conversation: 01jm3csstrfygp35rk0y1w3rfc
  prompt: '<documents> <document index="1"> <source>huggingface_hub/__init__.py</source>
    <document_content> # Copyright 2020 The…'
  usage:
    input: 480216
    output: 1791
llm models -q gemini -q exp

Both llm models and llm embed-models now take multiple -q search fragments. You can now search for all models matching "gemini" and "exp" using llm models -q gemini -q exp. #748

I have over 100 models installed in LLM now across a bunch of different plugins. I added the -q option to help search through them a few months ago, and now I’ve upgraded it so you can pass it multiple times.
Want to see all the Gemini experimental models?
llm models -q gemini -q exp
Outputs:
GeminiPro: gemini-exp-1114
GeminiPro: gemini-exp-1121
GeminiPro: gemini-exp-1206
GeminiPro: gemini-2.0-flash-exp
GeminiPro: learnlm-1.5-pro-experimental
GeminiPro: gemini-2.0-flash-thinking-exp-1219
GeminiPro: gemini-2.0-flash-thinking-exp-01-21
GeminiPro: gemini-2.0-pro-exp-02-05 (aliases: g2)

For consistency I added the same options to the llm embed-models command, which lists available embedding models.
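For example, to find embedding models matching both "nomic" and "moe" (the fragments here are chosen just for illustration):
llm embed-models -q nomic -q moe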
llm embed-multi --prepend X

New llm embed-multi --prepend X option for prepending a string to each value before it is embedded – useful for models such as nomic-embed-text-v2-moe that require passages to start with a string like "search_document: ". #745

This was inspired by my initial experiments with Nomic Embed Text V2 last week.
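Here’s the shape of the command – a sketch that assumes a docs.json file of items to embed, a collection called my-docs, and that a plugin registers the Nomic model under this ID:
llm embed-multi my-docs docs.json \
  -m nomic-embed-text-v2-moe \
  --prepend 'search_document: '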
Everything else

The response.json() and response.usage() methods are now documented.

Someone asked a question about these methods online, which made me realize they weren’t documented. I enjoy promptly turning questions like this into documentation!
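In use they look something like this – a quick sketch, with gpt-4o-mini standing in for whichever model you use:
import llm

model = llm.get_model("gpt-4o-mini")
response = model.prompt("Say hi")
print(response.text())  # the response text
print(response.json())  # raw JSON response data from the underlying API
print(response.usage())  # token counts for the prompt and response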

Fixed a bug where conversations that were loaded from the database could not be continued using asyncio prompts. #742

This bug was reported by Romain Gehrig. It turned out not to be possible to execute a follow-up prompt in async mode if the previous conversation had been loaded from the database.
% llm 'hi' --async
Hello! How can I assist you today?
% llm 'now in french' --async -c
Error: 'async for' requires an object with __aiter__ method, got Response

I fixed the bug for the moment, but I’d like to make the whole mechanism of persisting and loading conversations from SQLite part of the documented and supported Python API – it’s currently tucked away in CLI-specific internals which aren’t safe for people to use in their own code.
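For reference, a multi-turn async conversation via the Python API looks something like this – a minimal sketch that keeps everything in memory, so it never touches that load-from-database path:
import asyncio
import llm

async def main():
    model = llm.get_async_model("gpt-4o-mini")
    conversation = model.conversation()
    # Each prompt continues the same conversation
    print(await conversation.prompt("hi").text())
    print(await conversation.prompt("now in french").text())

asyncio.run(main())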

The llm-claude-3 plugin has been renamed to llm-anthropic.

I wrote about this previously when I announced llm-anthropic. The new name prepares me for a world in which Anthropic release models that aren’t called Claude 3 or Claude 3.5!
Tags: projects, ai, annotated-release-notes, openai, generative-ai, chatgpt, llms, llm, anthropic, gemini

AI Summary and Description

Summary: The release notes for LLM 0.22 provide significant updates related to API key management, model access for OpenAI’s chatgpt-4o-latest, and enhancements to logging and searching features within the library. These updates are particularly relevant for developers using LLM as a Python library, emphasizing security by allowing better management of sensitive API keys.

Detailed Description:
The release notes detail several important changes in LLM 0.22, focusing on functionality enhancements that improve usability and security for developers. Key updates include:

- **API Key Management**: The new `model.prompt(…, key=)` parameter allows users to pass API keys directly in code rather than relying on environment variables, supporting scenarios such as retrieving keys from customer-specific secret stores.

- **New Model Alias**:
  - The `chatgpt-4o-latest` model ID provides access to the latest version of OpenAI’s model used in ChatGPT without needing to update version numbers manually. This model can change without warning, emphasizing the need for careful monitoring of API usage and associated costs.

- **Logging Improvements**:
  - The new `llm logs -s/--short` flag significantly streamlines log output, presenting it in a concise YAML format. This helps developers manage logs effectively, a critical aspect of debugging and monitoring application performance.

- **Search Functionality**:
  - The ability to use multiple query fragments with `llm models -q` enhances the search capabilities for available models, making it easier to filter through large sets of models.

- **Embedding Models**:
  - The new `llm embed-multi --prepend X` option prepends a string to each value before it is embedded, which some embedding models require in order to work correctly.

- **Documentation Enhancements**:
  - Several previously undocumented features, such as `response.json()` and `response.usage()`, are now explicitly documented, facilitating a better developer experience through clarity and support.

- **Bug Fixes and Naming**:
  - A fix for a bug affecting asynchronous continuation of stored conversations, along with the renaming of the `llm-claude-3` plugin to `llm-anthropic`, demonstrates ongoing support and improvement of the library.

These updates reflect the continuous evolution of AI library tools, underscoring the importance of security, user-friendliness, and operational efficacy in cloud and AI contexts. Such enhancements should be closely monitored by professionals in AI, cloud security, and software development to leverage these capabilities responsibly and effectively.