Simon Willison’s Weblog: Start building with Gemini 2.5 Flash

Apr 17, 2025

—

Source URL: https://simonwillison.net/2025/Apr/17/start-building-with-gemini-25-flash/
Source: Simon Willison’s Weblog
Title: Start building with Gemini 2.5 Flash

Feedly Summary: Start building with Gemini 2.5 Flash
Google Gemini’s latest model is Gemini 2.5 Flash, available in (paid) preview as gemini-2.5-flash-preview-04-17.

“Building upon the popular foundation of 2.0 Flash, this new version delivers a major upgrade in reasoning capabilities, while still prioritizing speed and cost. Gemini 2.5 Flash is our first fully hybrid reasoning model, giving developers the ability to turn thinking on or off. The model also allows developers to set thinking budgets to find the right tradeoff between quality, cost, and latency.”

I added support for the new model in llm-gemini 0.18. Here’s how to try it out:
llm install -U llm-gemini
llm -m gemini-2.5-flash-preview-04-17 'Generate an SVG of a pelican riding a bicycle'

Here’s that first pelican, using the default setting where Gemini 2.5 Flash makes its own decision about how much "thinking" effort to apply:

Here’s the transcript. This one used 11 input tokens and 4266 output tokens of which 2702 were "thinking" tokens.
I asked the model to "describe" that image and it could tell it was meant to be a pelican:

A simple illustration on a white background shows a stylized pelican riding a bicycle. The pelican is predominantly grey with a black eye and a prominent pink beak pouch. It is positioned on a black line-drawn bicycle with two wheels, a frame, handlebars, and pedals.

AI Studio indicates that the model is charged at $0.15/million tokens for input, $3.50/million for thinking output and $0.60/million for non-thinking output. I presume that means thinking tokens are charged separately. I calculate this prompt as costing 1.04 cents.
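
The arithmetic behind that figure can be sketched in a few lines of Python. This is a minimal sketch, assuming (as presumed above) that thinking tokens are billed at the thinking rate and the remaining output tokens at the non-thinking rate. The function name `prompt_cost_cents` and the constant names are illustrative only; they are not part of llm or the Gemini API.

```python
# Prices in USD per million tokens, as quoted from AI Studio in this post.
INPUT_PER_M = 0.15
THINKING_OUTPUT_PER_M = 3.50
NON_THINKING_OUTPUT_PER_M = 0.60


def prompt_cost_cents(input_tokens, output_tokens, thinking_tokens):
    """Estimate the cost of one prompt in US cents.

    Assumes output_tokens is the total output count and already
    includes thinking_tokens, as in the figures reported above.
    """
    non_thinking_tokens = output_tokens - thinking_tokens
    dollars = (
        input_tokens * INPUT_PER_M
        + thinking_tokens * THINKING_OUTPUT_PER_M
        + non_thinking_tokens * NON_THINKING_OUTPUT_PER_M
    ) / 1_000_000
    return dollars * 100


# The default-thinking run: 11 input, 4266 output, 2702 of them thinking.
print(round(prompt_cost_cents(11, 4266, 2702), 2))  # 1.04
```

Almost all of that cost comes from the 2702 thinking tokens, since they are priced at nearly six times the non-thinking output rate.
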
Let’s try again with thinking disabled:
llm -m gemini-2.5-flash-preview-04-17 'Generate an SVG of a pelican riding a bicycle' -o thinking_budget 0

11 input, 1705 output of which 25 were "thinking". Transcript here. 0.11 cents.
Its self-description:

A minimalist illustration shows a bright yellow bird riding a bicycle. The bird has a simple round body, small wings, a black eye, and an open orange beak. It sits atop a simple black bicycle frame with two large circular black wheels. The bicycle also has black handlebars and black and yellow pedals. The scene is set against a solid light blue background with a thick green stripe along the bottom, suggesting grass or ground.

And finally, let’s ramp the thinking budget up to the maximum:
llm -m gemini-2.5-flash-preview-04-17 'Generate an SVG of a pelican riding a bicycle' -o thinking_budget 24576

I think it over-thought this one! Transcript – 5174 output tokens of which 3023 were thinking. 1.19 cents. Its own self-evaluation:

A simple, cartoon-style drawing shows a bird-like figure riding a bicycle. The figure has a round gray head with a black eye and a large, flat orange beak with a yellow stripe on top. Its body is represented by a curved light gray shape extending from the head to a smaller gray shape representing the torso or rear. It has simple orange stick legs with round feet or connections at the pedals. The figure is bent forward over the handlebars in a cycling position. The bicycle is drawn with thick black outlines and has two large wheels, a frame, and pedals connected to the orange legs. The background is plain white, with a dark gray line at the bottom representing the ground.
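
The three runs can be put side by side to see how much of each bill comes from thinking tokens. This is a rough sketch under the same pricing assumption as the estimates above (thinking tokens at $3.50/million, other output at $0.60/million, input at $0.15/million). The `runs` list and the `cost_breakdown` helper are illustrative names, and the 11 input tokens for the maximum-budget run is an assumption based on the prompt being identical.

```python
# USD per million tokens, as quoted from AI Studio in this post.
RATES = {"input": 0.15, "thinking": 3.50, "non_thinking": 0.60}

# (label, input tokens, total output tokens, thinking tokens) as reported above.
# The input count for the maximum-budget run is assumed to be 11 (same prompt).
runs = [
    ("default", 11, 4266, 2702),
    ("thinking_budget 0", 11, 1705, 25),
    ("thinking_budget 24576", 11, 5174, 3023),
]


def cost_breakdown(input_tokens, output_tokens, thinking_tokens):
    """Return (total cost in cents, fraction of that cost due to thinking)."""
    thinking = thinking_tokens * RATES["thinking"]
    other = (output_tokens - thinking_tokens) * RATES["non_thinking"]
    prompt = input_tokens * RATES["input"]
    total = thinking + other + prompt  # in millionths of a dollar
    return total / 10_000, thinking / total


for label, inp, out, think in runs:
    cents, share = cost_breakdown(inp, out, think)
    print(f"{label:<22} {cents:5.2f} cents  {share:4.0%} of cost from thinking")
```

With the reported figures, thinking accounts for roughly nine tenths of the cost in both runs where it was enabled, and under a tenth with the budget set to 0.
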

One thing I really appreciate about Gemini 2.5 Flash’s approach to SVGs is that it shows very good taste in CSS, comments and general SVG class structure. Here’s a truncated extract – I run a lot of these SVG tests against different models and this one has a coding style that I particularly enjoy. (Gemini 2.5 Pro does this too.)

<style>
  .bike-frame { fill: none; stroke: #333; stroke-width: 8; stroke-linecap: round; stroke-linejoin: round; }
  .wheel-rim { fill: none; stroke: #333; stroke-width: 8; }
  .wheel-hub { fill: #333; }
  /* … */
  .pelican-body { fill: #d3d3d3; stroke: black; stroke-width: 3; }
  .pelican-head { fill: #d3d3d3; stroke: black; stroke-width: 3; }
  /* … */
</style>
<!-- Ground Line -->
<line x1="0" y1="480" x2="800" y2="480" stroke="#555" stroke-width="5"/>
<!-- Bicycle -->
<g id="bicycle">
  <!-- Wheels -->
  <circle class="wheel-rim" cx="250" cy="400" r="70"/>
  <circle class="wheel-hub" cx="250" cy="400" r="10"/>
  <circle class="wheel-rim" cx="550" cy="400" r="70"/>
  <circle class="wheel-hub" cx="550" cy="400" r="10"/>
  <!-- … -->
</g>
<!-- Pelican -->
<g id="pelican">
  <!-- Body -->
  <path class="pelican-body" d="M 440 330 C 480 280 520 280 500 350 C 480 380 420 380 440 330 Z"/>
  <!-- Neck -->
  <path class="pelican-neck" d="M 460 320 Q 380 200 300 270"/>
  <!-- Head -->
  <circle class="pelican-head" cx="300" cy="270" r="35"/>
  <!-- … -->
Tags: llm-release, gemini, llm, google, llm-reasoning, llm-pricing, llms, pelican-riding-a-bicycle

AI Summary and Description: Yes

**Summary:** Google has released Gemini 2.5 Flash in paid preview. It is a hybrid reasoning model: developers can turn thinking on or off and set a thinking budget to trade off quality, cost, and latency. The post tests the model by generating an SVG at three thinking settings and compares the token usage and estimated cost of each run.

**Detailed Description:**

- **Gemini 2.5 Flash Overview:**
  - This is a new model from Google, available in paid preview as gemini-2.5-flash-preview-04-17 and building on Gemini 2.0 Flash.
  - Google describes it as its first fully hybrid reasoning model, meaning developers can turn thinking on or off.
  - The stated upgrade is improved reasoning while still prioritizing speed and cost.

- **Features:**
  - **Thinking Capability Management:** Developers can turn thinking on or off and set a thinking budget to trade off quality, cost, and latency.
  - **Cost Structure:** Pricing, as shown in AI Studio, varies by token type:
    - $0.15 per million tokens for input
    - $3.50 per million tokens for thinking output
    - $0.60 per million tokens for non-thinking output
  - The post reports token usage and estimated cost for three runs: default thinking (1.04 cents), thinking disabled (0.11 cents), and the maximum thinking budget of 24576 (1.19 cents).

- **Practical Demonstrations:**
  - The post gives the commands for upgrading llm-gemini and running the model with different thinking budgets.
  - The three runs (default, thinking disabled, and maximum budget) show how the output and its cost change with thinking effort.

- **Coding Style Appreciation:**
  - The SVG generated by the model is praised for its clean CSS, comments, and class structure, so the quality shows in the code style as well as in the rendered image.

- **Implications for Professionals:**
  - The model’s features are relevant to cloud computing and AI development, particularly for professionals deploying large language models (LLMs).
  - The ability to set thinking budgets could help teams control inference cost and latency, which may be relevant to DevSecOps and MLOps practices.
  - As AI models become more sophisticated, secure and compliant deployment becomes more important, and controls such as thinking budgets may be relevant to AI governance.

This announcement from Google signals an important advancement in the capabilities of AI models, with potential applications in graphics generation, resource optimization, and cost management in cloud computing environments.
