Source URL: https://simonwillison.net/2025/Jul/22/gemini-25-flash-lite/#atom-everything
Source: Simon Willison’s Weblog
Title: Gemini 2.5 Flash-Lite is now stable and generally available
Feedly Summary:
The last remaining member of the Gemini 2.5 trio joins Pro and Flash in General Availability today.
Gemini 2.5 Flash-Lite is the cheapest of the 2.5 family, at $0.10/million input tokens and $0.40/million output tokens. This puts it equal to GPT-4.1 Nano on my llm-prices.com comparison table.
The preview version of that model had the same pricing for text tokens, but the stable release is now cheaper for audio:
"We have also reduced audio input pricing by 40% from the preview launch."
I released llm-gemini 0.24 with support for the new model alias:
llm install -U llm-gemini
llm -m gemini-2.5-flash-lite \
-a https://static.simonwillison.net/static/2024/pelican-joke-request.mp3
I wrote more about the Gemini 2.5 Flash-Lite preview model last month.
Tags: google, ai, generative-ai, llms, llm, gemini, llm-pricing, llm-release
AI Summary and Description: Yes
Summary: The text covers the general availability of Gemini 2.5 Flash-Lite, the cheapest model in the Gemini 2.5 family. It notes that text pricing is unchanged from the preview while audio input pricing has dropped by 40%, and mentions the release of llm-gemini 0.24, which adds support for the new model alias.
Detailed Description: The announcement about Gemini 2.5 Flash-Lite provides insights into recent developments in large language models (LLMs), particularly in terms of accessibility and cost efficiency for developers and businesses. Here are the major points and implications of the release:
– **General Availability**: Gemini 2.5 Flash-Lite has reached general availability, indicating readiness for deployment in real-world applications.
– **Pricing Strategy**:
  – The cost for using the Flash-Lite model is set at $0.10 per million input tokens and $0.40 per million output tokens, positioning it competitively against models like GPT-4.1 Nano.
  – A noteworthy 40% reduction in audio input pricing enhances its attractiveness, especially for applications requiring audio processing capabilities.
– **Software Support**: The llm-gemini 0.24 release adds support for the new gemini-2.5-flash-lite model alias, so developers can call the model directly from the llm command-line tool.
– **Model Comparison**: This launch is part of a broader trend in LLM pricing, where competitive costs could drive wider adoption across various sectors.
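To make the per-token prices above concrete, here is a minimal sketch of a cost estimate for text-only requests. The two price constants come from the post; the function name `estimate_cost` and the example workload are hypothetical and only there for illustration. Audio input is billed differently and is not covered here.

```python
# Prices in USD per million tokens, as quoted in the post.
INPUT_PRICE_PER_MILLION = 0.10
OUTPUT_PRICE_PER_MILLION = 0.40


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one text-only request.

    Hypothetical helper: it covers text tokens only and ignores
    audio input, which is priced separately.
    """
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Assumed workload: 10,000 requests, each 2,000 tokens in and 500 out.
    per_request = estimate_cost(2_000, 500)
    print(f"per request: ${per_request:.6f}")
    print(f"10,000 requests: ${per_request * 10_000:.2f}")
```

Under that assumed workload, each request costs about $0.0004, so 10,000 requests come to roughly $4.00. Output tokens cost four times as much as input tokens, so response length tends to dominate the bill for chatty workloads.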
**Key Implications for Professionals**:
– **Cost Efficiency**: Organizations looking to implement AI solutions may find Gemini 2.5 Flash-Lite an appealing option due to its affordability, which can lead to lower operational costs for AI projects.
– **Audio Processing Capabilities**: The lower price for audio input may encourage applications that send audio to the model, such as transcription or voice-driven customer service tools.
– **Development Tools**: With llm-gemini 0.24 installed, developers can try the model from the llm command line, including with audio attachments passed via the -a option, which lowers the barrier to experimenting with it.
Overall, the general availability of Gemini 2.5 Flash-Lite completes the Gemini 2.5 lineup and adds a stable low-cost option priced level with GPT-4.1 Nano, with cheaper audio input than the preview offered.