Simon Willison’s Weblog: Gemini 2.5 Pro Preview pricing

Source URL: https://simonwillison.net/2025/Apr/4/gemini-25-pro-pricing/
Source: Simon Willison’s Weblog
Title: Gemini 2.5 Pro Preview pricing

Feedly Summary: Gemini 2.5 Pro Preview pricing
Google’s Gemini 2.5 Pro is currently the top model on LM Arena and, from my own testing, a superb model for OCR, audio transcription and long-context coding.
You can now pay for it!
The new gemini-2.5-pro-preview-03-25 model ID is priced like this:

Prompts less than 200,000 tokens: $1.25/million tokens for input, $10/million for output
Prompts more than 200,000 tokens (up to the 1,048,576 max): $2.50/million for input, $15/million for output

This is priced at around the same level as Gemini 1.5 Pro ($1.25/$5 for input/output below 128,000 tokens, $2.50/$10 above 128,000 tokens), is cheaper than GPT-4o for shorter prompts ($2.50/$10) and is cheaper than Claude 3.7 Sonnet ($3/$15).
Gemini 2.5 Pro is a reasoning model, and invisible reasoning tokens are included in the output token count. I just tried prompting “hi” and it charged me 2 tokens for input and 623 for output, of which 613 were “thinking” tokens. That still adds up to just 0.6232 cents using my LLM pricing calculator, which I updated to support the new model just now.
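To sanity-check that number, here is a minimal sketch in plain Python (the rates and token counts are the ones quoted above; the variable names are mine):
# Rates for gemini-2.5-pro-preview-03-25, prompts under 200,000 tokens
INPUT_RATE = 1.25    # dollars per million input tokens
OUTPUT_RATE = 10.0   # dollars per million output tokens (thinking tokens are billed as output)
input_tokens = 2     # the "hi" prompt
output_tokens = 623  # 613 of these were invisible "thinking" tokens
cost_dollars = (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000
print(f"{cost_dollars * 100:.5f} cents")  # 0.62325 cents
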
I released llm-gemini 0.17 this morning adding support for the new model:
llm install -U llm-gemini
llm -m gemini-2.5-pro-preview-03-25 hi
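
The same model can also be driven from Python via llm's library interface. A minimal sketch, assuming llm and llm-gemini are installed and a Gemini API key has already been stored (for example with llm keys set gemini):
import llm
# Load the paid preview model registered by the llm-gemini plugin
model = llm.get_model("gemini-2.5-pro-preview-03-25")
# Send a prompt; the stored Gemini key should be picked up automatically
response = model.prompt("hi")
print(response.text())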

Note that the model continues to be available for free under the previous gemini-2.5-pro-exp-03-25 model ID:
llm -m gemini-2.5-pro-exp-03-25 hi

The free tier is "used to improve our products", the paid tier is not.
Rate limits for the paid model vary by tier – from 150/minute and 1,000/day for Tier 1 (billing configured), 1,000/minute and 50,000/day for Tier 2 ($250 total spend) and 2,000/minute and unlimited/day for Tier 3 ($1,000 total spend).
Google are retiring the Gemini 2.0 Pro preview entirely in favour of 2.5.
Via @OfficialLoganK
Tags: gemini, llm, generative-ai, llm-pricing, ai, llms, inference-scaling, google

AI Summary and Description: Yes

Summary: The text discusses the pricing and features of Google’s Gemini 2.5 Pro, a leading AI model known for capabilities in OCR, audio transcription, and long-context coding. It highlights the comparison of pricing against previous models and competitors, revealing insights into the economic landscape of AI services which are pertinent for professionals in AI and cloud computing.

Detailed Description:

The text communicates critical information regarding the newly introduced Gemini 2.5 Pro by Google, presenting it not only in comparison to earlier versions but also to other competitive models in the realm of large language models (LLMs). Here are the major points extracted:

– **Model Pricing**:
  – Pricing for the new model varies based on prompt token count (see the cost-estimation sketch below this list):
    – **Less than 200,000 tokens**:
      – Input: $1.25/million tokens
      – Output: $10/million tokens
    – **More than 200,000 tokens** (up to the maximum of 1,048,576 tokens):
      – Input: $2.50/million tokens
      – Output: $15/million tokens
  – Comparatively, Gemini 2.5 Pro is cheaper than GPT-4o for shorter prompts and cheaper than Claude 3.7 Sonnet.
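
To illustrate how the two tiers combine, here is a rough Python sketch (the function and variable names are mine; the rates are those listed above, and prompts of exactly 200,000 tokens are assumed to fall in the lower tier):
def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    # Tier is determined by prompt (input) length; <= 200,000 assumed to use the lower rates
    if input_tokens <= 200_000:
        input_rate, output_rate = 1.25, 10.0   # $/million tokens
    else:
        input_rate, output_rate = 2.50, 15.0   # $/million tokens
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Example: a 250,000-token prompt with a 4,000-token response
print(estimate_cost_usd(250_000, 4_000))  # 0.685 (dollars)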

– **Model Capabilities**:
  – The model performs strongly at OCR, audio transcription, and long-context coding. It is a “reasoning model,” meaning it generates hidden “thinking” tokens that are billed as part of the output.
  – Token usage example: prompting the model with “hi” resulted in a charge of 2 input tokens and 623 output tokens, 613 of which were “thinking” tokens (roughly 0.6232 cents in total).

– **Free vs. Paid Tiers**:
  – The model remains available for free under the previous gemini-2.5-pro-exp-03-25 model ID; data from the free tier is “used to improve our products,” while data from the paid tier is not.
  – Rate limits for the paid model depend on the billing tier:
    – Tier 1 (billing configured): 150 requests/minute and 1,000/day
    – Tier 2 ($250 total spend): 1,000 requests/minute and 50,000/day
    – Tier 3 ($1,000 total spend): 2,000 requests/minute and unlimited/day

– **Retirement of Previous Model**:
  – Google is phasing out the Gemini 2.0 Pro preview in favor of the 2.5 version, indicating a strategic shift in its offerings.

This information is particularly relevant for professionals in AI, cloud computing, and software development as it reflects trends in LLM pricing strategies, operational capabilities, and considerations for scaling AI solutions. Understanding these dynamics can assist practitioners in making informed decisions regarding the adoption and integration of AI technologies in their operations.