Source URL: https://simonwillison.net/2024/Dec/20/december-in-llms-has-been-a-lot/#atom-everything
Source: Simon Willison’s Weblog
Title: December in LLMs has been a lot
Feedly Summary: I had big plans for December: for one thing, I was hoping to get to an actual RC of Datasette 1.0, in preparation for a full release in January. Instead, I’ve found myself distracted by a constant barrage of new LLM releases.
On December 4th Amazon introduced the Amazon Nova family of multi-modal models – clearly priced to compete with the excellent and inexpensive Gemini 1.5 series from Google. I got those working with LLM via a new llm-bedrock plugin.
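For reference, trying Nova through LLM looks roughly like this. This is a sketch, not a definitive recipe: the `nova-lite` alias is an assumption (run `llm models` to see what the plugin actually registers), and you need AWS credentials with Bedrock access already configured.

```shell
# Install the plugin into an existing LLM installation
llm install llm-bedrock

# List the newly registered Bedrock models
llm models

# Run a prompt against one of the Nova models
# (model alias assumed; requires AWS credentials with Bedrock access)
llm -m nova-lite "Describe this image" -a image.jpg
```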
The next big release was Llama 3.3 70B-Instruct, on December 6th. Meta claimed that this 70B model was comparable in quality to their much larger 405B model, and those claims seem to hold up.
I wrote about how I can now run a GPT-4 class model on my laptop – the same laptop that was running a GPT-3 class model just 20 months ago.
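If you want to try that local experiment yourself, Ollama is one way to do it. A sketch, with assumptions: the `llama3.3` model tag is assumed, the 70B weights are a roughly 40GB download, and you need a machine with enough RAM to hold them.

```shell
# Fetch and run Llama 3.3 70B locally with Ollama
# (model tag assumed; ~40GB download, needs a high-RAM machine)
ollama pull llama3.3
ollama run llama3.3 "Explain quantum entanglement briefly"

# Or drive the same local model from LLM via the llm-ollama plugin
llm install llm-ollama
llm -m llama3.3 "Explain quantum entanglement briefly"
```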
Llama 3.3 70B has started showing up from API providers now, including super-fast hosted versions from both Groq (276 tokens/second) and Cerebras (a quite frankly absurd 2,200 tokens/second). If you haven’t tried Val Town’s Cerebras Coder demo you really should.
I think the huge gains in model efficiency are one of the defining stories of LLMs in 2024. It’s not just the local models that have benefited: the price of proprietary hosted LLMs has dropped through the floor, a result of both competition between vendors and the increasing efficiency of the models themselves.
Last year the running joke was that every time Google put out a new Gemini release OpenAI would ship something more impressive that same day to undermine them.
The tides have turned! This month Google shipped three updates that took the wind out of OpenAI’s sails.
The first was Gemini 2.0 Flash on the 11th of December, the first release in Google’s Gemini 2.0 series. The streaming support was particularly impressive, with https://aistudio.google.com/live demonstrating streaming audio and webcam communication with the multi-modal LLM a full day before OpenAI released their own streaming camera/audio features in an update to ChatGPT.
Then this morning Google shipped Gemini 2.0 Flash “Thinking mode”, their version of the inference scaling technique pioneered by OpenAI’s o1. I did not expect Gemini to ship a version of that before 2024 had even ended.
OpenAI have one day left in their 12 Days of OpenAI event. Previous highlights have included the full o1 model (an upgrade from o1-preview) and o1-pro, Sora (upstaged a week later by Google’s Veo 2), Canvas (with a confusing second way to run Python), Advanced Voice with video streaming and Santa, ChatGPT Projects (pretty much a direct lift of the similar Claude feature) and the 1-800-CHATGPT phone line.
Tomorrow is the last day. I’m not going to try to predict what they’ll launch, but I imagine it will be something notable to close out the year.
Blog entries
Gemini 2.0 Flash "Thinking mode"
Building Python tools with a one-shot prompt using uv run and Claude Projects
Gemini 2.0 Flash: An outstanding multi-modal LLM with a sci-fi streaming mode
ChatGPT Canvas can make API requests now, but it’s complicated
I can now run a GPT-4 class model on my laptop
Prompts.js
First impressions of the new Amazon Nova LLMs (via a new llm-bedrock plugin)
Storing times for human events
Ask questions of SQLite databases and CSV/JSON files in your terminal
Releases
llm-gemini 0.8 – 2024-12-19 – LLM plugin to access Google’s Gemini family of models
datasette-enrichments-slow 0.1 – 2024-12-18 – An enrichment on a slow loop to help debug progress bars
llm-anthropic 0.11 – 2024-12-17 – LLM access to models by Anthropic, including the Claude series
llm-openrouter 0.3 – 2024-12-08 – LLM plugin for models hosted by OpenRouter
prompts-js 0.0.4 – 2024-12-08 – async alternatives to browser alert() and prompt() and confirm()
datasette-enrichments-llm 0.1a0 – 2024-12-05 – Enrich data by prompting LLMs
llm 0.19.1 – 2024-12-05 – Access large language models from the command-line
llm-bedrock 0.4 – 2024-12-04 – Run prompts against models hosted on AWS Bedrock
datasette-queries 0.1a0 – 2024-12-03 – Save SQL queries in Datasette
datasette-llm-usage 0.1a0 – 2024-12-02 – Track usage of LLM tokens in a SQLite table
llm-mistral 0.9 – 2024-12-02 – LLM plugin providing access to Mistral models using the Mistral API
llm-claude-3 0.10 – 2024-12-02 – LLM plugin for interacting with the Claude 3 family of models
datasette 0.65.1 – 2024-11-29 – An open source multi-tool for exploring and publishing data
sqlite-utils-ask 0.2 – 2024-11-24 – Ask questions of your data with LLM assistance
sqlite-utils 3.38 – 2024-11-23 – Python CLI utility and library for manipulating SQLite databases
TILs
Fixes for datetime UTC warnings in Python – 2024-12-12
Publishing a simple client-side JavaScript package to npm with GitHub Actions – 2024-12-08
GitHub OAuth for a static site using Cloudflare Workers – 2024-11-29
Tags: google, ai, weeknotes, openai, generative-ai, chatgpt, llms, gemini, o1
AI Summary and Description: Yes
**Summary:** The text discusses recent advancements and releases in large language models (LLMs) from various tech giants, highlighting notable launches from Amazon, Google, and OpenAI. It underscores the competitive landscape of AI development, particularly the efficiencies gained in LLMs, and the rapid innovation cycle.
**Detailed Description:**
The content provides a high-level overview of the current state of large language model (LLM) development and the competitive dynamics between major AI players. The focus is on how companies like Amazon, Google, and OpenAI are expanding their offerings and capabilities in LLM technologies.
Key points include:
– **Amazon’s Nova Models**:
– Launched the Amazon Nova family of multi-modal models on December 4th, aimed at competing with Google’s Gemini series.
– Introduction of the llm-bedrock plugin facilitating interaction with these new models.
– **Meta’s Llama 3.3 70B-Instruct**:
– Released on December 6th, claimed to offer competitive performance relative to larger models, fueling discussions about efficiency.
– **Efficiency Improvements**:
– The post identifies large gains in model efficiency as a defining trend of LLMs in 2024, affecting both local and hosted models.
– The decrease in cost for proprietary hosted LLMs is attributed to competition and enhanced model efficiency.
– **Google’s Gemini 2.0 Series**:
– Gemini 2.0 Flash was launched with impressive streaming capabilities, emerging ahead of OpenAI’s updates in the same timeframe.
– Google’s “Thinking mode” feature showcases advanced inference scaling methods.
– **OpenAI’s Competitive Response**:
– Highlights the historical competition between OpenAI and Google, noting how Google’s recent updates have positioned them favorably in the market.
– **Future Directions**:
– The writer anticipates OpenAI’s forthcoming release to potentially close the year on a strong note, indicating ongoing competition and rapid advancements.
– **Additional Releases and Plugins**:
– The text lists various other software releases and plugins related to LLMs, emphasizing the surge of tools being developed to explore and manipulate data efficiently using these models.
This overview is significant for professionals in the AI, cloud computing, and security sectors, highlighting both the competitive landscape and the rapid advancement of LLM technologies. Understanding the pace of these developments is crucial for aligning business strategy, addressing security considerations, and ensuring compliance in rapidly evolving AI environments.