Tag: context window
-
Cloud Blog: More choice, more control: self-deploy proprietary models in your VPC with Vertex AI
Source URL: https://cloud.google.com/blog/products/ai-machine-learning/new-proprietary-models-vertex-model-garden/ Source: Cloud Blog Title: More choice, more control: self-deploy proprietary models in your VPC with Vertex AI Feedly Summary: Building the best AI applications requires both the freedom to choose the most powerful, specialized model for the task at hand, and a platform that can handle them all. This flexibility is core…
-
Simon Willison’s Weblog: Two more Chinese pelicans
Source URL: https://simonwillison.net/2025/Oct/1/two-pelicans/#atom-everything Source: Simon Willison’s Weblog Title: Two more Chinese pelicans Feedly Summary: Two new models from Chinese AI labs in the past few days. I tried them both out using llm-openrouter: DeepSeek-V3.2-Exp from DeepSeek. Announcement, Tech Report, Hugging Face (690GB, MIT license). As an intermediate step toward our next-generation architecture, V3.2-Exp builds upon…
-
Cloud Blog: Announcing Claude Sonnet 4.5 on Vertex AI
Source URL: https://cloud.google.com/blog/products/ai-machine-learning/announcing-claude-sonnet-4-5-on-vertex-ai/ Source: Cloud Blog Title: Announcing Claude Sonnet 4.5 on Vertex AI Feedly Summary: Today, we’re announcing the general availability of Claude Sonnet 4.5, Anthropic’s most intelligent model and its best-performing model for complex agents, coding, and computer use, on Vertex AI. Claude Sonnet 4.5 is built to work independently for hours, maintaining clarity…
-
AWS News Blog: Introducing Claude Sonnet 4.5 in Amazon Bedrock: Anthropic’s most intelligent model, best for coding and complex agents
Source URL: https://aws.amazon.com/blogs/aws/introducing-claude-sonnet-4-5-in-amazon-bedrock-anthropics-most-intelligent-model-best-for-coding-and-complex-agents/ Source: AWS News Blog Title: Introducing Claude Sonnet 4.5 in Amazon Bedrock: Anthropic’s most intelligent model, best for coding and complex agents Feedly Summary: Amazon Web Services announces Claude Sonnet 4.5 in Amazon Bedrock, featuring advanced capabilities in coding, tool handling, and long-horizon tasks, with improvements in memory management, context processing, and…
-
Cloud Blog: Deutsche Bank delivers AI-powered financial research with DB Lumina
Source URL: https://cloud.google.com/blog/topics/financial-services/deutsche-bank-delivers-ai-powered-financial-research-with-db-lumina/ Source: Cloud Blog Title: Deutsche Bank delivers AI-powered financial research with DB Lumina Feedly Summary: At Deutsche Bank Research, the core mission of our analysts is delivering original, independent economic and financial analysis. However, creating research reports and notes relies heavily on a foundation of painstaking manual work. Or at least that…
-
Simon Willison’s Weblog: Grok 4 Fast
Source URL: https://simonwillison.net/2025/Sep/20/grok-4-fast/ Source: Simon Willison’s Weblog Title: Grok 4 Fast Feedly Summary: New hosted reasoning model from xAI that’s designed to be fast and extremely competitive on price. It has a 2 million token context window and “was trained end-to-end with tool-use reinforcement learning”. It’s priced at $0.20/million input tokens and…
-
Simon Willison’s Weblog: Kimi-K2-Instruct-0905
Source URL: https://simonwillison.net/2025/Sep/6/kimi-k2-instruct-0905/#atom-everything Source: Simon Willison’s Weblog Title: Kimi-K2-Instruct-0905 Feedly Summary: New not-quite-MIT licensed model from Chinese Moonshot AI, a follow-up to the highly regarded Kimi-K2 model they released in July. This one is an incremental improvement – I’ve seen it referred to online as “Kimi K-2.1”. It scores a little higher on a…
-
Cloud Blog: How Baseten achieves 225% better cost-performance for AI inference (and you can too)
Source URL: https://cloud.google.com/blog/products/ai-machine-learning/how-baseten-achieves-better-cost-performance-for-ai-inference/ Source: Cloud Blog Title: How Baseten achieves 225% better cost-performance for AI inference (and you can too) Feedly Summary: Baseten is one of a growing number of AI infrastructure providers, helping other startups run their models and experiments at speed and scale. Given the importance of those two factors to its customers,…
-
Simon Willison’s Weblog: Introducing gpt-realtime
Source URL: https://simonwillison.net/2025/Sep/1/introducing-gpt-realtime/#atom-everything Source: Simon Willison’s Weblog Title: Introducing gpt-realtime Feedly Summary: Released a few days ago (August 28th), gpt-realtime is OpenAI’s new “most advanced speech-to-speech model”. It looks like this is a replacement for the older gpt-4o-realtime-preview model that was released last October. This is a slightly confusing release. The previous realtime…
-
Simon Willison’s Weblog: too many model context protocol servers and LLM allocations on the dance floor
Source URL: https://simonwillison.net/2025/Aug/22/too-many-mcps/#atom-everything Source: Simon Willison’s Weblog Title: too many model context protocol servers and LLM allocations on the dance floor Feedly Summary: Useful reminder from Geoffrey Huntley of the infrequently discussed significant token cost of using MCP. Geoffrey estimates that…