Tag: token count
-
Tomasz Tunguz: Adding Complexity Reduced My AI Cost by 41%
Source URL: https://www.tomtunguz.com/adding-complexity-reduced-my-ai-cost-by-41-percent/
Feedly Summary: I discovered I was designing my AI tools backwards. Here’s an example. This was my newsletter processing chain: reading emails, calling a newsletter processor, extracting companies, & then adding them to the CRM. This involved four different…
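The summary names the four stages of the chain before it cuts off; as a purely illustrative sketch (not the author's code), with hypothetical function names and stub bodies, the shape of that pipeline might be:

```python
# Illustrative sketch of the four-stage chain the summary describes:
# read emails -> newsletter processor -> extract companies -> add to CRM.
# All names and stub bodies here are hypothetical stand-ins.

def fetch_emails(inbox: str) -> list[str]:
    return ["Acme raised a Series B...", "Globex launched a new product..."]

def process_newsletter(email: str) -> str:
    # Stand-in for an LLM call that summarizes the newsletter.
    return f"Summary: {email[:40]}"

def extract_companies(summary: str) -> list[str]:
    # Stand-in for an LLM call that pulls company names out of the summary.
    return [word for word in summary.split() if word.istitle()]

def add_to_crm(companies: list[str]) -> None:
    print("CRM upsert:", companies)

for email in fetch_emails("newsletters"):            # 1. read emails
    summary = process_newsletter(email)               # 2. newsletter processor
    companies = extract_companies(summary)            # 3. extract companies
    add_to_crm(companies)                              # 4. add them to the CRM
```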
-
Cloud Blog: BigQuery under the hood: Scalability, reliability and usability enhancements for gen AI inference
Source URL: https://cloud.google.com/blog/products/data-analytics/bigquery-enhancements-to-boost-gen-ai-inference/
Feedly Summary: People often think of BigQuery in the context of data warehousing and analytics, but it is a crucial part of the AI ecosystem as well. And today, we’re excited to share significant performance…
-
Simon Willison’s Weblog: Google Gemini URL Context
Source URL: https://simonwillison.net/2025/Aug/18/google-gemini-url-context/
Feedly Summary: Google Gemini URL Context New feature in the Gemini API: you can now enable a url_context tool which the models can use to request the contents of URLs as part of replying to a prompt. I released llm-gemini 0.25 with a…
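A minimal sketch of turning the tool on, assuming the google-genai Python SDK exposes it as types.UrlContext (the post itself exercises the feature through llm-gemini 0.25):

```python
# Sketch, assuming the google-genai SDK exposes the tool as types.UrlContext;
# the model IDs and tool name are as described, the rest is illustrative.
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Summarize the key points of https://example.com/some-article",
    config=types.GenerateContentConfig(
        tools=[types.Tool(url_context=types.UrlContext())],
    ),
)
print(response.text)
```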
-
Tomasz Tunguz: The Surprising Input-to-Output Ratio of AI Models
Source URL: https://www.tomtunguz.com/input-output-ratio/
Feedly Summary: When you query an AI model, it gathers relevant information to generate an answer. For a while, I’ve wondered: how much information does the model need to answer a question? I thought the output would be larger, however…
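The ratio the post asks about can be read off a response's usage metadata; a small sketch, assuming the google-genai SDK and its prompt_token_count / candidates_token_count fields:

```python
# Sketch: measure the input-to-output token ratio of a single call.
# The usage_metadata field names are assumed from the google-genai SDK.
from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Explain what a column-oriented database is in two sentences.",
)

usage = response.usage_metadata
ratio = usage.prompt_token_count / usage.candidates_token_count
print(f"input: {usage.prompt_token_count} tokens, "
      f"output: {usage.candidates_token_count} tokens, ratio: {ratio:.1f}x")
```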
-
Simon Willison’s Weblog: Trying out the new Gemini 2.5 model family
Source URL: https://simonwillison.net/2025/Jun/17/gemini-2-5/
Feedly Summary: After many months of previews, Gemini 2.5 Pro and Flash have reached general availability with new, memorable model IDs: gemini-2.5-pro and gemini-2.5-flash. They are joined by a new preview model with an unmemorable name: gemini-2.5-flash-lite-preview-06-17 is a…
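A quick sketch of hitting the model IDs named in the summary, using the google-genai SDK rather than the llm tooling the post itself uses:

```python
# Sketch: the GA and preview model IDs named in the post, called in a loop.
# Only the model IDs come from the source; the prompt is illustrative.
from google import genai

client = genai.Client()
for model_id in ("gemini-2.5-pro", "gemini-2.5-flash",
                 "gemini-2.5-flash-lite-preview-06-17"):
    response = client.models.generate_content(
        model=model_id,
        contents="Say hello in five words.",
    )
    print(model_id, "->", response.text)
```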
-
Simon Willison’s Weblog: llm-mistral 0.14
Source URL: https://simonwillison.net/2025/May/29/llm-mistral-014/#atom-everything
Feedly Summary: llm-mistral 0.14 I added tool-support to my plugin for accessing the Mistral API from LLM today, plus support for Mistral’s new Codestral Embed embedding model. An interesting challenge here is that I’m not using an official client library for llm-mistral – I rolled…
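A minimal sketch of using the new embedding model through LLM's Python embedding API, assuming llm-mistral registers it under the ID codestral-embed:

```python
# Sketch, assuming llm-mistral 0.14 registers the new embedding model under
# the ID "codestral-embed" (run `llm embed-models` to check the actual ID).
import llm

model = llm.get_embedding_model("codestral-embed")
vector = model.embed("def add(a, b): return a + b")
print(len(vector), vector[:5])  # embedding dimensionality and a few values
```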
-
Simon Willison’s Weblog: Highlights from the Claude 4 system prompt
Source URL: https://simonwillison.net/2025/May/25/claude-4-system-prompt/
Feedly Summary: Anthropic publish most of the system prompts for their chat models as part of their release notes. They recently shared the new prompts for both Claude Opus 4 and Claude Sonnet 4. I enjoyed digging through the prompts,…
-
Cloud Blog: Announcing Anthropic’s Claude Opus 4 and Claude Sonnet 4 on Vertex AI
Source URL: https://cloud.google.com/blog/products/ai-machine-learning/anthropics-claude-opus-4-and-claude-sonnet-4-on-vertex-ai/
Feedly Summary: Today, we’re expanding the choice of third-party models available in Vertex AI Model Garden with the addition of Anthropic’s newest generation of the Claude model family: Claude Opus 4 and Claude Sonnet 4. Both…
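A minimal sketch of calling one of these models through the anthropic SDK's Vertex client; the project, region, and versioned model ID are assumptions, not taken from the announcement:

```python
# Sketch: Claude Sonnet 4 on Vertex AI via the anthropic SDK's Vertex client.
# Project ID, region, and the versioned model ID below are assumptions;
# check the Model Garden listing for the exact identifier in your region.
from anthropic import AnthropicVertex

client = AnthropicVertex(project_id="my-gcp-project", region="us-east5")

message = client.messages.create(
    model="claude-sonnet-4@20250514",
    max_tokens=512,
    messages=[{"role": "user", "content": "Summarize Vertex AI Model Garden in one sentence."}],
)
print(message.content[0].text)
```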
-
Simon Willison’s Weblog: Gemini 2.5: Our most intelligent models are getting even better
Source URL: https://simonwillison.net/2025/May/20/gemini-25/#atom-everything
Feedly Summary: Gemini 2.5: Our most intelligent models are getting even better A bunch of new Gemini 2.5 announcements at Google I/O today. 2.5 Flash and 2.5 Pro are both getting audio output (previously previewed in Gemini…
-
Simon Willison’s Weblog: llm-gemini 0.19.1
Source URL: https://simonwillison.net/2025/May/8/llm-gemini-0191/#atom-everything
Feedly Summary: llm-gemini 0.19.1 Bugfix release for my llm-gemini plugin, which was recording the number of output tokens (needed to calculate the price of a response) incorrectly for the Gemini “thinking” models. Those models turn out to return candidatesTokenCount and thoughtsTokenCount as two separate values…
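The fix presumably amounts to counting both fields toward output tokens when pricing a response; a tiny sketch of that accounting, using the field names quoted above and made-up counts:

```python
# Sketch of the accounting behind the fix: for Gemini "thinking" models,
# billable output tokens are candidatesTokenCount + thoughtsTokenCount,
# not candidatesTokenCount alone. The counts below are made up.
def output_token_count(usage_metadata: dict) -> int:
    return (usage_metadata.get("candidatesTokenCount", 0)
            + usage_metadata.get("thoughtsTokenCount", 0))

usage = {"promptTokenCount": 21, "candidatesTokenCount": 193, "thoughtsTokenCount": 1142}
print(output_token_count(usage))  # 1335 output tokens to price, not just 193
```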