Source URL: https://simonwillison.net/2025/Mar/12/notes-on-googles-gemma-3/
Source: Simon Willison’s Weblog
Title: Notes on Google’s Gemma 3
Feedly Summary: Google’s Gemma team released an impressive new model today (under their not-open-source Gemma license). Gemma 3 comes in four sizes – 1B, 4B, 12B, and 27B – and while 1B is text-only the larger three models are all multi-modal for vision:
Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling.
I tried out the largest model using the latest Ollama – this is the second time I’ve spotted a major model release partnering with Ollama on launch day, the first being Mistral Small 3.
I ran this (after upgrading Ollama through their menu icon upgrade option):
ollama pull gemma3:27b
That pulled 17GB of model weights. I’ve been trying it out using LLM and llm-ollama:
llm install llm-ollama
llm -m gemma3:27b 'Build a single page HTML+CSS+JavaScript UI that gives me a large textarea for writing in which constantly saves what I have entered to localStorage (restoring when I reload the page) and displays a word counter'
That was a replay of a prompt I ran against Claude Artifacts a few months ago. Here’s what Gemma built, and the full chat transcript. It’s a simple example but it worked just right.
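The core logic that prompt asks for is small. This is a minimal sketch of that logic (not Gemma's actual output): persist the textarea contents to localStorage on every input event, restore on load, and count words. A tiny in-memory stand-in for localStorage (and a hypothetical `draft` key) makes it runnable outside a browser.

```javascript
// In-memory stand-in for the browser's localStorage, so the
// save/restore logic can run outside a page.
const localStorage = {
  store: {},
  getItem(k) { return k in this.store ? this.store[k] : null; },
  setItem(k, v) { this.store[k] = String(v); },
};

const KEY = "draft"; // hypothetical storage key

// In the browser version this runs on every "input" event.
function save(text) {
  localStorage.setItem(KEY, text);
}

// Runs once on page load; returns "" when nothing was saved.
function restore() {
  return localStorage.getItem(KEY) ?? "";
}

// Word counter displayed next to the textarea.
function wordCount(text) {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

save("hello from the textarea");
console.log(restore());            // → hello from the textarea
console.log(wordCount(restore())); // → 4
```

In the real page these functions would be wired to the textarea's `input` event and the `DOMContentLoaded` handler; the persistence and counting logic is otherwise identical.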
Something I’ve been curious about recently is longer context support: how well can a local model on my laptop deal with summarization or data extraction tasks against longer pieces of text?
I decided to try my Hacker News summarize script using Gemma, against the thread there discussing the Gemma 3 technical paper.
First I did a quick token count (using the OpenAI tokenizer but it’s usually a similar number to other models):
curl 'https://hn.algolia.com/api/v1/items/43340491' | ttok
This returned 22,260 – well within Gemma’s documented limits but still a healthy number considering just last year most models topped out at 4,000 or 8,000.
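If ttok isn't installed, a common rule of thumb gives a rough sanity check: English prose averages about four characters per token with OpenAI-style BPE tokenizers. This sketch is only an approximation, not a substitute for a real tokenizer count:

```javascript
// Rough, tokenizer-free token estimate using the ~4 characters per
// token heuristic for English prose. ttok gives the real count.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

const sample = "Gemma 3 handles context windows up to 128k tokens.";
console.log(estimateTokens(sample)); // → 13
```

For a 22,260-token thread this heuristic would land in the same ballpark, which is enough to confirm the input fits a model's documented context window before spending time on the actual run.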
I ran my script like this:
hn-summary.sh 43340491 -m gemma3:27b
It did a pretty good job! Here’s the full prompt and response. The one big miss is that it ignored my instructions to include illustrative quotes – I don’t know if modifying the prompt will fix that but it’s disappointing that it didn’t handle that well, given how important direct quotes are for building confidence in RAG-style responses.
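A script like hn-summary.sh has one non-obvious data-prep step: the Algolia item API returns the thread as a nested comment tree, which has to be flattened into plain text before it can go into the prompt. This is a sketch of that step only (not the actual script, which also handles the fetch and the llm call); the miniature thread here is hypothetical:

```javascript
// Flatten the nested comment tree returned by
// https://hn.algolia.com/api/v1/items/<id> into indented plain text,
// one comment per line, ready to paste into a summarization prompt.
function flattenComments(item, depth = 0) {
  const lines = [];
  if (item.text) lines.push("  ".repeat(depth) + item.text);
  for (const child of item.children ?? []) {
    lines.push(...flattenComments(child, depth + 1));
  }
  return lines;
}

const thread = { // hypothetical miniature thread
  text: null,    // the story item itself has no comment text
  children: [
    { text: "Impressive context window.", children: [
        { text: "Agreed, 128k locally is new.", children: [] },
    ]},
    { text: "How does it compare to Mistral Small 3?", children: [] },
  ],
};

console.log(flattenComments(thread).join("\n"));
```

The indentation preserves the reply structure, which gives the model a chance to attribute points to sub-threads rather than treating the discussion as one flat block.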
Here’s what I got for Generate an SVG of a pelican riding a bicycle:
llm -m gemma3:27b 'Generate an SVG of a pelican riding a bicycle'
https://static.simonwillison.net/static/2025/gemma-3-pelican.svg
You can also try out the new Gemma in Google AI Studio, and via their API. I added support for it to llm-gemini 0.15, though sadly it appears vision mode doesn’t work with that API hosted model yet.
llm install -U llm-gemini
llm keys set gemini
# paste key here
llm -m gemma-3-27b-it 'five facts about pelicans of interest to skunks'
Here’s what I got. I’m not sure how pricing works for that hosted model yet.
Gemma 3 is also already available through MLX-VLM – here’s their model collection – but I haven’t tried that version yet.
AI Summary and Description: Yes
Summary: The text describes the release of Google’s Gemma 3 model, a significant multimodal AI tool that can handle extensive context windows and supports various languages. This model represents a notable advancement in AI capabilities and has interesting implications for professionals working in AI, especially regarding operational integration and use cases in LLM security and compliance.
Detailed Description:
The announcement that Google’s Gemma team has launched Gemma 3, an advanced AI model, provides significant insights into the evolving capabilities of AI systems. This model is particularly relevant for professionals in the fields of AI, cloud computing, and security, as it relates to various aspects of AI security and operational utilization. Here are the critical highlights:
– **Model Specifications**:
– Gemma 3 is available in four sizes: 1B, 4B, 12B, and 27B.
– The smaller 1B model focuses solely on text, while the larger models (4B, 12B, and 27B) support multimodal inputs, integrating both vision and text.
– It offers a context window supporting up to 128,000 tokens, significantly surpassing previous limitations common in most models.
– **Language and Functionality**:
– The model can understand over 140 languages and exhibits enhanced capabilities in math and reasoning, showcasing potential for varied applications in enterprise settings.
– Gemma 3 also includes structured outputs and function calling, which are critical for developers looking to integrate AI functionalities into larger systems.
– **User Experience**:
– Users can experience Gemma 3’s capabilities through platforms like Ollama, allowing for easy implementation and experimentation.
– Examples shared demonstrate the model’s application for creating web UI components and summarizing extensive textual content, underscoring its versatility.
– **Implications for Security and Compliance**:
– The introduction of such powerful AI capabilities raises questions around security, particularly concerning data privacy and compliance with regulations.
– The model’s ability to handle extensive user inputs might necessitate new compliance frameworks to ensure responsible use and data handling.
– Since the model supports an API and integrates with existing tools, professionals will need to address security concerns related to its deployment and operational integrity.
– **Challenges Noted**:
– While performing a summarization task, the model missed instructions for including illustrative quotes, highlighting areas for improvement in prompt responsiveness and task accuracy.
Overall, Gemma 3 signifies an important step forward in AI model development and its operational usage, presenting both opportunities and challenges in security and compliance that professionals will need to navigate as they integrate such technologies into their systems.