Source URL: https://simonwillison.net/2025/May/8/llm-gemini-0191/#atom-everything
Source: Simon Willison’s Weblog
Title: llm-gemini 0.19.1
Feedly Summary: llm-gemini 0.19.1
Bugfix release for my llm-gemini plugin, which was recording the number of output tokens (needed to calculate the price of a response) incorrectly for the Gemini “thinking" models. Those models turn out to return candidatesTokenCount and thoughtsTokenCount as two separate values which need to be added together to get the total billed output token count. Full details in this issue.
I spotted this potential bug in this response log this morning, and my concerns were confirmed when Paul Gauthier wrote about a similar fix in Aider in Gemini 2.5 Pro Preview 03-25 benchmark cost, where he noted that the $6.32 cost recorded for benchmarking Gemini 2.5 Pro Preview 03-25 was incorrect. Since that model is no longer available (despite the date-based model alias persisting), Paul is not able to accurately calculate the new cost, but it’s likely a lot more, since the Gemini 2.5 Pro Preview 05-06 benchmark cost $37.
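For context, here is a minimal sketch of the fix’s arithmetic, assuming a Gemini-style response dict whose usageMetadata object carries the two counts named above; the surrounding structure and the example values are illustrative, not the plugin’s actual code:

```python
# Minimal sketch: total billed output tokens for a Gemini "thinking" model.
# candidatesTokenCount and thoughtsTokenCount come back as separate values and
# must be summed; thoughtsTokenCount is simply absent for non-thinking models.
def billed_output_tokens(response: dict) -> int:
    usage = response.get("usageMetadata", {})
    return usage.get("candidatesTokenCount", 0) + usage.get("thoughtsTokenCount", 0)


# Illustrative response fragment (values made up):
example = {
    "usageMetadata": {
        "promptTokenCount": 10,
        "candidatesTokenCount": 47,
        "thoughtsTokenCount": 1024,
        "totalTokenCount": 1081,
    }
}
print(billed_output_tokens(example))  # 1071
```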
Tags: paul-gauthier, gemini, llm, aider, generative-ai, llm-pricing, ai, llms
AI Summary and Description: Yes
Summary: The text discusses a bug fix in the llm-gemini plugin related to the miscalculation of output tokens for Gemini models. This issue has implications for billing accuracy in generative AI applications, highlighting the importance of precise token accounting for AI system costs.
Detailed Description:
This text pertains to the realm of AI, specifically focusing on issues related to Large Language Models (LLMs) and their operational aspects, which directly affect AI Security and Software Security. Key points from the text include:
– **Bugfix Release**: The plugin ‘llm-gemini’ underwent a bugfix to accurately record output tokens from Gemini models.
– **Miscalculation Issue**: The plugin was under-recording output tokens because it failed to add together the two separate values (candidatesTokenCount and thoughtsTokenCount) that make up the total billed output token count.
– **Pricing Implications**: The correct total token count is critical because it directly determines what users are billed. The text contrasts the two Gemini 2.5 Pro Preview benchmark costs, underscoring the financial consequences of miscounting for users (see the cost sketch after this list).
– **Signal of Broader Issues**: The mention of a similar fix by Paul Gauthier in Aider indicates that these discrepancies are likely symptomatic of larger challenges in accurately accounting for AI usage costs, especially in generative contexts.
– **Importance for Compliance and Security**: Accurate cost accounting is not only a financial concern but also ties to governance and compliance within organizations that rely on precise budgets for AI expenditures and system usage.
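To make the pricing point concrete, here is an illustrative calculation of how omitting thoughtsTokenCount understates cost; the per-token rate is a made-up placeholder, not Gemini’s actual price:

```python
# Illustrative only: omitting thoughtsTokenCount understates the output cost.
HYPOTHETICAL_OUTPUT_PRICE_PER_MILLION = 10.00  # USD per 1M output tokens (placeholder)


def output_cost(candidates_tokens: int, thoughts_tokens: int = 0) -> float:
    """Billed output cost = (candidatesTokenCount + thoughtsTokenCount) * rate."""
    billed_tokens = candidates_tokens + thoughts_tokens
    return billed_tokens * HYPOTHETICAL_OUTPUT_PRICE_PER_MILLION / 1_000_000


print(output_cost(47))        # 0.00047 -- thinking tokens ignored (the bug)
print(output_cost(47, 1024))  # 0.01071 -- thinking tokens included (the fix)
```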
This analysis highlights how interoperability and functionality debugging in AI models can have cascading effects on billing and compliance, reinforcing the need for robust security practices and thorough validation processes within AI operations and financial planning.
– **Recommendations for Professionals**:
– Regularly audit billing functionalities in AI applications to ensure accurate cost management.
– Implement validation checks to catch such discrepancies early (see the sketch after this list), enhancing both accuracy and trust in generative AI systems.
– Stay informed about updates and changes in AI models that could affect both pricing and functionality, ensuring compliance with budgeting and financial forecasting.
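As an example of the kind of validation check suggested above, here is a minimal sketch assuming Gemini-style usageMetadata fields and that totalTokenCount should equal the sum of the component counts, which may not hold for every model or feature; the function name and logic are illustrative:

```python
# Illustrative consistency check on a usageMetadata dict: warn when the
# component token counts do not add up to the reported total.
import logging


def audit_token_usage(usage: dict) -> None:
    prompt = usage.get("promptTokenCount", 0)
    candidates = usage.get("candidatesTokenCount", 0)
    thoughts = usage.get("thoughtsTokenCount", 0)
    total = usage.get("totalTokenCount", 0)
    if prompt + candidates + thoughts != total:
        logging.warning(
            "Token accounting mismatch: %d + %d + %d != %d",
            prompt, candidates, thoughts, total,
        )
```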