Source URL: https://simonwillison.net/2025/May/8/llm-gemini-0191/#atom-everything
Source: Simon Willison’s Weblog
Title: llm-gemini 0.19.1
Feedly Summary: llm-gemini 0.19.1
Bugfix release for my llm-gemini plugin, which was recording the number of output tokens (needed to calculate the price of a response) incorrectly for the Gemini “thinking" models. Those models turn out to return candidatesTokenCount and thoughtsTokenCount as two separate values which need to be added together to get the total billed output token count. Full details in this issue.
I spotted this potential bug in this response log this morning, and my concerns were confirmed when Paul Gauthier wrote about a similar fix in Aider in Gemini 2.5 Pro Preview 03-25 benchmark cost, where he noted that the $6.32 cost recorded for benchmarking Gemini 2.5 Pro Preview 03-25 was incorrect. Since that model is no longer available (despite the date-based model alias persisting), Paul is not able to accurately calculate the new cost, but it’s likely a lot more, since the Gemini 2.5 Pro Preview 05-06 benchmark cost $37.
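For context, here is a minimal sketch of the fix’s arithmetic, assuming a Gemini-style response dict whose usageMetadata object carries the two counts named above; the surrounding structure and the example values are illustrative, not the plugin’s actual code:

```python
# Minimal sketch: total billed output tokens for a Gemini "thinking" model.
# candidatesTokenCount and thoughtsTokenCount come back as separate values and
# must be summed; thoughtsTokenCount is simply absent for non-thinking models.
def billed_output_tokens(response: dict) -> int:
    usage = response.get("usageMetadata", {})
    return usage.get("candidatesTokenCount", 0) + usage.get("thoughtsTokenCount", 0)


# Illustrative response fragment (values made up):
example = {
    "usageMetadata": {
        "promptTokenCount": 10,
        "candidatesTokenCount": 47,
        "thoughtsTokenCount": 1024,
        "totalTokenCount": 1081,
    }
}
print(billed_output_tokens(example))  # 1071
```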
Tags: paul-gauthier, gemini, llm, aider, generative-ai, llm-pricing, ai, llms
AI Summary and Description: Yes
Summary: The text discusses a bug fix in the llm-gemini plugin related to the miscalculation of output tokens for Gemini models. This issue has implications for billing accuracy in generative AI applications, highlighting the importance of precise token accounting for AI system costs.
Detailed Description:
This text pertains to the realm of AI, specifically focusing on issues related to Large Language Models (LLMs) and their operational aspects, which directly affect AI Security and Software Security. Key points from the text include:
– **Bugfix Release**: The plugin ‘llm-gemini’ underwent a bugfix to accurately record output tokens from Gemini models.
– **Miscalculation Issue**: The plugin was under-recording output tokens because it failed to add together the two separate values (candidatesTokenCount and thoughtsTokenCount) that make up the total billed output token count.
– **Pricing Implications**: The correct total token count is critical because it directly determines what users are billed. The text contrasts the two Gemini 2.5 Pro Preview benchmark costs, underscoring the financial consequences of miscounting for users (see the cost sketch after this list).
– **Signal of Broader Issues**: The mention of a similar fix by Paul Gauthier in Aider indicates that these discrepancies are likely symptomatic of larger challenges in accurately accounting for AI usage costs, especially in generative contexts.
– **Importance for Compliance and Security**: Accurate cost accounting is not only a financial concern but also ties to governance and compliance within organizations that rely on precise budgets for AI expenditures and system usage.
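To make the pricing point concrete, here is an illustrative calculation of how omitting thoughtsTokenCount understates cost; the per-token rate is a made-up placeholder, not Gemini’s actual price:

```python
# Illustrative only: omitting thoughtsTokenCount understates the output cost.
HYPOTHETICAL_OUTPUT_PRICE_PER_MILLION = 10.00  # USD per 1M output tokens (placeholder)


def output_cost(candidates_tokens: int, thoughts_tokens: int = 0) -> float:
    """Billed output cost = (candidatesTokenCount + thoughtsTokenCount) * rate."""
    billed_tokens = candidates_tokens + thoughts_tokens
    return billed_tokens * HYPOTHETICAL_OUTPUT_PRICE_PER_MILLION / 1_000_000


print(output_cost(47))        # 0.00047 -- thinking tokens ignored (the bug)
print(output_cost(47, 1024))  # 0.01071 -- thinking tokens included (the fix)
```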
This analysis highlights how interoperability and functionality debugging in AI models can have cascading effects on billing and compliance, reinforcing the need for robust security practices and thorough validation processes within AI operations and financial planning.
– **Recommendations for Professionals**:
– Regularly audit billing functionalities in AI applications to ensure accurate cost management.
– Implement validation checks to catch such discrepancies early (see the sketch after this list), enhancing both accuracy and trust in generative AI systems.
– Stay informed about updates and changes in AI models that could affect both pricing and functionality, ensuring compliance with budgeting and financial forecasting.
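As an example of the kind of validation check suggested above, here is a minimal sketch assuming Gemini-style usageMetadata fields and that totalTokenCount should equal the sum of the component counts, which may not hold for every model or feature; the function name and logic are illustrative:

```python
# Illustrative consistency check on a usageMetadata dict: warn when the
# component token counts do not add up to the reported total.
import logging


def audit_token_usage(usage: dict) -> None:
    prompt = usage.get("promptTokenCount", 0)
    candidates = usage.get("candidatesTokenCount", 0)
    thoughts = usage.get("thoughtsTokenCount", 0)
    total = usage.get("totalTokenCount", 0)
    if prompt + candidates + thoughts != total:
        logging.warning(
            "Token accounting mismatch: %d + %d + %d != %d",
            prompt, candidates, thoughts, total,
        )
```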