Simon Willison’s Weblog: Deep Think in the Gemini app

Aug 1, 2025

—

Source URL: https://simonwillison.net/2025/Aug/1/deep-think-in-the-gemini-app/
Source: Simon Willison’s Weblog
Title: Deep Think in the Gemini app

Feedly Summary:
Google released Gemini 2.5 Deep Think this morning, exclusively to their Ultra ($250/month) subscribers:

It is a variation of the model that recently achieved the gold-medal standard at this year’s International Mathematical Olympiad (IMO). While that model takes hours to reason about complex math problems, today’s release is faster and more usable day-to-day, while still reaching Bronze-level performance on the 2025 IMO benchmark, based on internal evaluations.

Google describe Deep Think’s architecture like this:

Just as people tackle complex problems by taking the time to explore different angles, weigh potential solutions, and refine a final answer, Deep Think pushes the frontier of thinking capabilities by using parallel thinking techniques. This approach lets Gemini generate many ideas at once and consider them simultaneously, even revising or combining different ideas over time, before arriving at the best answer.
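
Google has not published how Deep Think implements this, so the following is only a toy sketch of the general "explore several lines of thought in parallel, then keep the best one" idea. Everything in it (the `explore`, `score`, and `parallel_think` names, and the integer-square-root task) is a hypothetical stand-in chosen for illustration, not anything from Gemini:

```python
from concurrent.futures import ThreadPoolExecutor


def score(candidate: int, target: int) -> int:
    """Lower is better: how far candidate squared is from the target."""
    return abs(candidate * candidate - target)


def explore(start: int, target: int, steps: int = 5) -> int:
    """One hypothetical 'line of thought': a short greedy search from a start point."""
    current = start
    for _ in range(steps):
        # Look at the immediate neighbours and move to the best-scoring one.
        best = min((current - 1, current, current + 1), key=lambda c: score(c, target))
        if best == current:
            break  # no neighbour improves on the current idea
        current = best
    return current


def parallel_think(target: int, starts: list[int]) -> int:
    """Run several explorations concurrently and keep the best final candidate."""
    with ThreadPoolExecutor() as pool:
        candidates = list(pool.map(lambda s: explore(s, target), starts))
    return min(candidates, key=lambda c: score(c, target))


if __name__ == "__main__":
    print(parallel_think(144, [0, 10, 20, 30]))
```

Only the explorer that starts near the answer gets there within its step budget, which is the point of running several at once. This sketch only selects among candidates; the "revising or combining different ideas" part of Google's description is not modelled here.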

This approach sounds a little similar to the llm-consortium plugin by Thomas Hughes; see this video from January’s Datasette Public Office Hours.
I don’t have an Ultra account, but thankfully nickandbro on Hacker News tried "Create a svg of a pelican riding on a bicycle" (a very slight modification of my prompt, which uses "Generate an SVG") and got back a very solid result:

The bicycle is the right shape, and this is one of the few results I’ve seen for this prompt where the bird is very clearly a pelican thanks to the shape of its beak.
There are more details on Deep Think in the Gemini 2.5 Deep Think Model Card (PDF). Some highlights from that document:

– 1 million token input window, accepting text, images, audio, and video.
– Text output up to 192,000 tokens.
– Training ran on TPUs and used JAX and ML Pathways.
– "We additionally trained Gemini 2.5 Deep Think on novel reinforcement learning techniques that can leverage more multi-step reasoning, problem-solving and theorem-proving data, and we also provided access to a curated corpus of high-quality solutions to mathematics problems."
– Knowledge cutoff is January 2025.
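
To make the two token limits above concrete, here is a minimal sketch of a pre-flight check a caller might run before sending a request. The constant and function names are hypothetical, "1 million" is taken literally as 1,000,000, and the two limits are assumed to be independent of each other; the model card excerpt quoted here does not confirm either detail:

```python
# Limits as quoted from the model card; exact values are assumptions.
MAX_INPUT_TOKENS = 1_000_000   # "1 million token input window"
MAX_OUTPUT_TOKENS = 192_000    # "Text output up to 192,000 tokens"


def fits_budget(input_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if both counts fall within the quoted limits."""
    return (
        0 <= input_tokens <= MAX_INPUT_TOKENS
        and 0 <= requested_output_tokens <= MAX_OUTPUT_TOKENS
    )


if __name__ == "__main__":
    print(fits_budget(500_000, 100_000))   # within both limits
    print(fits_budget(1_200_000, 1_000))   # input too large
```

Token counts would have to come from whatever tokenizer or counting endpoint the caller actually uses; this sketch deliberately leaves that out.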

Via Hacker News
Tags: google, ai, generative-ai, llms, gemini, pelican-riding-a-bicycle, llm-reasoning, llm-release

AI Summary and Description: Yes

Summary: The text outlines the release of Google’s Gemini 2.5 Deep Think, an advanced AI model emphasizing novel reasoning capabilities and high versatility. This development is particularly relevant for professionals in AI and cloud computing security, given its implications for AI’s performance and application across various domains.

Detailed Description:

The release of Google’s Gemini 2.5 Deep Think signifies a strategic advancement in AI technologies, especially in the realm of generative AI and large language models (LLMs). Given its features and capabilities, professionals in AI security, cloud computing security, and infrastructure can glean several important insights from it:

– **Performance Benchmarking**: A variation of this model recently achieved the gold-medal standard at the 2025 International Mathematical Olympiad (IMO). The released Deep Think model reaches Bronze-level performance on the 2025 IMO benchmark, according to Google’s internal evaluations.
– **Improved Efficiency**: The gold-medal variant takes hours to reason about complex math problems; the released Gemini 2.5 Deep Think is faster and more usable day-to-day, which makes it more practical for professionals working on mathematical reasoning or complex data interpretation.
– **Innovative Architecture**:
  – Utilizes parallel thinking techniques akin to human problem-solving methods.
  – Generates multiple ideas simultaneously and can revise or combine them for improved results, which matters for scenarios needing quick decision-making or varied input analysis.

– **Robust Input Handling**:
  – Accepts a 1 million token input window across multiple formats (text, images, audio, video), making it a multifaceted tool for data scientists and AI engineers.

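– **Rich Training Methods**:
  – Trained with novel reinforcement learning techniques that leverage more multi-step reasoning, problem-solving, and theorem-proving data.
  – Given access during training to a curated corpus of high-quality solutions to mathematics problems, which may boost its applicability in educational and research contexts.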

– **Knowledge Cutoff**: The model’s knowledge cutoff is January 2025, so it has no built-in knowledge of later developments, a notable limitation in rapidly evolving AI fields.

Overall, Google’s Gemini 2.5 Deep Think offers potential advancements that could be harnessed for security applications, enhanced compliance in AI outputs, and increased capabilities in infrastructure security through its novel thinking and reasoning methodologies. These advancements demand that security professionals consider not only traditional security measures but also how to safeguard and ensure compliance within generative AI frameworks.
