Simon Willison’s Weblog: Gemini 2.5: Our most intelligent models are getting even better

Source URL: https://simonwillison.net/2025/May/20/gemini-25/#atom-everything
Source: Simon Willison’s Weblog
Title: Gemini 2.5: Our most intelligent models are getting even better

Feedly Summary: Gemini 2.5: Our most intelligent models are getting even better
A bunch of new Gemini 2.5 announcements at Google I/O today.
2.5 Flash and 2.5 Pro are both getting audio output (previously previewed in Gemini 2.0) and 2.5 Pro is getting an enhanced reasoning mode called “Deep Think” – not yet available via the API.
Available today is the latest Gemini 2.5 Flash model, gemini-2.5-flash-preview-05-20. I added support for it in llm-gemini 0.20 (and, if you’re using the LLM tool-use alpha, llm-gemini 0.20a2).
I tried it out on my personal benchmark, as seen in the Google I/O keynote!
llm -m gemini-2.5-flash-preview-05-20 'Generate an SVG of a pelican riding a bicycle'

Here’s what I got from the default model, with its thinking mode enabled:

Full transcript. 11 input tokens, 2,619 output tokens, 10,391 thinking tokens = 4.5537 cents.
I ran the same thing again with -o thinking_budget 0 to turn off thinking mode entirely, and got this:

Full transcript. 11 input, 1,243 output = 0.0747 cents.
The non-thinking model is priced differently – still $0.15/million for input but $0.60/million for output as opposed to $3.50/million for thinking+output. The pelican it drew was 61x cheaper!
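The arithmetic behind those figures can be reproduced with a short sketch. It uses only the per-million-token prices and token counts quoted above, and it assumes thinking tokens are billed at the same rate as output tokens when thinking mode is on (which is what the "thinking+output" price suggests). The function and constant names are made up for illustration; they are not part of llm or the Gemini API.

```python
# Prices in USD per million tokens, as quoted in the post.
INPUT_PRICE = 0.15
OUTPUT_PRICE_NO_THINKING = 0.60
OUTPUT_PRICE_THINKING = 3.50  # applies to thinking + output tokens


def cost_in_cents(input_tokens, output_tokens, thinking_tokens=0, thinking=True):
    """Return the cost of a single prompt in US cents (hypothetical helper)."""
    output_price = OUTPUT_PRICE_THINKING if thinking else OUTPUT_PRICE_NO_THINKING
    dollars = (
        input_tokens * INPUT_PRICE
        + (output_tokens + thinking_tokens) * output_price
    ) / 1_000_000
    return dollars * 100


# Token counts from the two pelican runs described above.
thinking_cost = cost_in_cents(11, 2_619, 10_391, thinking=True)
plain_cost = cost_in_cents(11, 1_243, thinking=False)

print(f"thinking:     {thinking_cost:.4f} cents")
print(f"non-thinking: {plain_cost:.4f} cents")
print(f"ratio:        {thinking_cost / plain_cost:.0f}x")
```

Almost all of the difference comes from the 10,391 thinking tokens billed at the higher rate; the 11 input tokens contribute a negligible amount in both runs.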
Finally, inspired by the keynote I ran this follow-up prompt to animate the more expensive pelican:
llm --cid 01jvqjqz9aha979yemcp7a4885 'Now animate it'

This one is pretty great!

Tags: llm-release, gemini, llm, google, generative-ai, pelican-riding-a-bicycle, ai, llm-reasoning, llm-pricing

AI Summary and Description: Yes

Summary: The text discusses the latest updates and features of Google’s Gemini 2.5 models, highlighting enhancements in audio output and reasoning capabilities that could impact the field of generative AI. It emphasizes the pricing structure related to the new features, making it relevant for professionals in AI and cloud computing.

Detailed Description: The provided text outlines several significant advancements in Google’s Gemini 2.5 generative AI models, particularly focusing on enhancements that may affect AI security, cloud computing, and the efficiency of AI application costs. Key points include:

– **New Model Announcements**: Gemini 2.5 Flash and Gemini 2.5 Pro models were introduced with notable enhancements.
– **Audio Output**: Both models now support audio output capabilities, which expands their application potential.
– **“Deep Think” Mode**: Gemini 2.5 Pro is getting an enhanced reasoning mode called “Deep Think.” It is not yet available via the API, so developers cannot use it programmatically for now.
– **Performance Benchmarking**: The text includes personal benchmarking results demonstrating the output capabilities of the models:
– With thinking mode enabled (the default), the pelican prompt used 2,619 output tokens plus 10,391 thinking tokens and cost 4.5537 cents; with thinking mode turned off, it used 1,243 output tokens and cost 0.0747 cents, about 61x cheaper.
– **Pricing Structure**: Input costs $0.15/million tokens in both modes. Output costs $0.60/million tokens with thinking mode off, versus $3.50/million for thinking plus output tokens with it on, which could influence budgets for companies using these models.

Overall, the developments mentioned could have implications for security, especially regarding data handling and costs associated with deploying advanced generative AI models in cloud environments, appealing to AI security and infrastructure professionals.

– **Key insights for professionals**:
– Models that incorporate reasoning capabilities may carry new security implications, and their cost structure should be taken into account when optimizing spending.
– Understanding the detailed pricing and performance metrics is essential for budgeting in AI operations.

This analysis underscores the importance of these AI advancements in strategic decision-making regarding AI model deployment, security, and compliance in various industry applications.