Source URL: https://simonwillison.net/2025/Apr/17/start-building-with-gemini-25-flash/
Source: Simon Willison’s Weblog
Title: Start building with Gemini 2.5 Flash
Feedly Summary: Start building with Gemini 2.5 Flash
Google Gemini’s latest model is Gemini 2.5 Flash, available in (paid) preview as gemini-2.5-flash-preview-04-17.
“Building upon the popular foundation of 2.0 Flash, this new version delivers a major upgrade in reasoning capabilities, while still prioritizing speed and cost. Gemini 2.5 Flash is our first fully hybrid reasoning model, giving developers the ability to turn thinking on or off. The model also allows developers to set thinking budgets to find the right tradeoff between quality, cost, and latency.”
I added support for the new model in llm-gemini 0.18. Here’s how to try it out:
llm install -U llm-gemini
llm -m gemini-2.5-flash-preview-04-17 'Generate an SVG of a pelican riding a bicycle'
Here’s that first pelican, using the default setting where Gemini 2.5 Flash makes its own decision about how much "thinking" effort to apply:
Here’s the transcript. This one used 11 input tokens and 4266 output tokens of which 2702 were "thinking" tokens.
I asked the model to "describe" that image and it could tell it was meant to be a pelican:
A simple illustration on a white background shows a stylized pelican riding a bicycle. The pelican is predominantly grey with a black eye and a prominent pink beak pouch. It is positioned on a black line-drawn bicycle with two wheels, a frame, handlebars, and pedals.
AI Studio indicates that the model is charged at $0.15/million tokens for input, $3.50/million for thinking output and $0.60/million for non-thinking output. I presume that means thinking tokens are charged separately. I calculate this prompt as costing 1.04 cents.
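The arithmetic behind that figure can be sketched in a few lines of Python. This is a minimal sketch, assuming (as presumed above) that thinking tokens are a subset of the reported output tokens and are billed at the thinking rate, with the remainder billed at the non-thinking rate. The function name `prompt_cost_cents` and the constant names are made up for illustration.

```python
# Prices in USD per million tokens, as quoted from AI Studio above.
INPUT_PRICE = 0.15
THINKING_PRICE = 3.50
OUTPUT_PRICE = 0.60


def prompt_cost_cents(input_tokens: int, output_tokens: int, thinking_tokens: int) -> float:
    """Estimate the cost of one prompt in cents.

    Assumption: thinking_tokens is included in output_tokens, and only the
    non-thinking remainder is billed at the regular output rate.
    """
    non_thinking = output_tokens - thinking_tokens
    dollars = (
        input_tokens * INPUT_PRICE
        + thinking_tokens * THINKING_PRICE
        + non_thinking * OUTPUT_PRICE
    ) / 1_000_000
    return dollars * 100


if __name__ == "__main__":
    # 11 input tokens, 4266 output tokens, 2702 of them thinking tokens.
    print(f"{prompt_cost_cents(11, 4266, 2702):.2f} cents")  # prints "1.04 cents"
```

Almost all of the cost comes from the thinking tokens: 2702 tokens at $3.50/million is about 0.95 cents, while the 1564 non-thinking output tokens add roughly 0.09 cents and the 11 input tokens are negligible.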
Let’s try again with thinking disabled:
llm -m gemini-2.5-flash-preview-04-17 'Generate an SVG of a pelican riding a bicycle' -o thinking_budget 0
11 input, 1705 output of which 25 were "thinking". Transcript here. 0.11 cents.
Its self-description:
A minimalist illustration shows a bright yellow bird riding a bicycle. The bird has a simple round body, small wings, a black eye, and an open orange beak. It sits atop a simple black bicycle frame with two large circular black wheels. The bicycle also has black handlebars and black and yellow pedals. The scene is set against a solid light blue background with a thick green stripe along the bottom, suggesting grass or ground.
And finally, let’s ramp the thinking budget up to the maximum:
llm -m gemini-2.5-flash-preview-04-17 'Generate an SVG of a pelican riding a bicycle' -o thinking_budget 24576
I think it over-thought this one! Transcript – 5174 output tokens of which 3023 were thinking. 1.19 cents. Its own self-evaluation:
A simple, cartoon-style drawing shows a bird-like figure riding a bicycle. The figure has a round gray head with a black eye and a large, flat orange beak with a yellow stripe on top. Its body is represented by a curved light gray shape extending from the head to a smaller gray shape representing the torso or rear. It has simple orange stick legs with round feet or connections at the pedals. The figure is bent forward over the handlebars in a cycling position. The bicycle is drawn with thick black outlines and has two large wheels, a frame, and pedals connected to the orange legs. The background is plain white, with a dark gray line at the bottom representing the ground.
One thing I really appreciate about Gemini 2.5 Flash’s approach to SVGs is that it shows very good taste in CSS, comments and general SVG class structure. Here’s a truncated extract – I run a lot of these SVG tests against different models and this one has a coding style that I particularly enjoy. (Gemini 2.5 Pro does this too).
AI Summary and Description: Yes
**Summary:** Google has launched Gemini 2.5 Flash, an advanced AI model with enhanced reasoning capabilities that allows developers to manage “thinking” budgets. This model prioritizes cost-efficiency while delivering high-quality outputs, making it significant for developers in AI and cloud services focusing on graphics generation and intelligent resource management.
**Detailed Description:**
– **Gemini 2.5 Flash Overview:**
– This is a new model from Google, launching in a paid preview phase, building on the successful Gemini 2.0 Flash.
– It introduces a hybrid reasoning approach, enabling developers to control the model’s cognitive workload.
– Significant upgrades include improved reasoning and a balance between quality, cost, and latency.
– **Features:**
– **Thinking Capability Management:** Developers can turn the model’s thinking on or off, and can set a thinking budget to trade off quality, cost, and latency.
– **Cost Structure:** The model’s pricing varies by type of token usage:
– $0.15 per million tokens for input
– $3.50 per million tokens for thinking output
– $0.60 per million tokens for non-thinking output
– The post’s examples report token usage and cost for each run: 1.04 cents at the default setting, 0.11 cents with thinking disabled, and 1.19 cents at the maximum thinking budget.
– **Practical Demonstrations:**
– The text includes the specific commands for installing llm-gemini and running the model, and shows how the output changes with the thinking budget.
– Three runs are compared: the default setting, thinking disabled (thinking_budget 0), and the maximum budget (thinking_budget 24576), each with the model’s own description of the resulting image.
– **Coding Style Appreciation:**
– The SVG generated by the model is praised for its clean CSS and structured comments, indicating quality not just in output imagery but also in code style and maintainability.
– **Implications for Professionals:**
– This new model’s features cater to cloud computing and AI development, particularly for professionals involved in implementing large language models (LLMs).
– The ability to manage reasoning budgets could lead to optimized resource utilization in cloud environments, directly benefiting DevSecOps and MLOps practices.
– As AI models become more sophisticated, the necessity for secure and compliant deployment becomes ever more critical, making Gemini 2.5 Flash’s capabilities relevant in ensuring effective AI governance.
This announcement from Google signals an important advancement in the capabilities of AI models, with potential applications in graphics generation, resource optimization, and cost management in cloud computing environments.