Tag: parameter
-
Simon Willison’s Weblog: XBai o4
Source URL: https://simonwillison.net/2025/Aug/3/xbai-o4/#atom-everything Source: Simon Willison’s Weblog Title: XBai o4 Feedly Summary: XBai o4 Yet another open source (Apache 2.0) LLM from a Chinese AI lab. This model card claims: XBai o4 excels in complex reasoning capabilities and has now completely surpassed OpenAI-o3-mini in Medium mode. This is a 32.8 billion parameter model released by MetaStone…
-
Simon Willison’s Weblog: Faster inference
Source URL: https://simonwillison.net/2025/Aug/1/faster-inference/ Source: Simon Willison’s Weblog Title: Faster inference Feedly Summary: Two interesting examples of inference speed as a flagship feature of LLM services today. First, Cerebras announced two new monthly plans for their extremely high speed hosted model service: Cerebras Code Pro ($50/month, 1,000 messages a day) and Cerebras Code Max ($200/month, 5,000/day).…
-
Simon Willison’s Weblog: More model releases on 31st July
Source URL: https://simonwillison.net/2025/Jul/31/more-models/ Source: Simon Willison’s Weblog Title: More model releases on 31st July Feedly Summary: Here are a few more model releases from today, to round out a very busy July: Cohere released Command A Vision, their first multi-modal (image input) LLM. Like their others it’s open weights under Creative Commons Attribution Non-Commercial, so…
-
Simon Willison’s Weblog: Trying out Qwen3 Coder Flash using LM Studio and Open WebUI and LLM
Source URL: https://simonwillison.net/2025/Jul/31/qwen3-coder-flash/ Source: Simon Willison’s Weblog Title: Trying out Qwen3 Coder Flash using LM Studio and Open WebUI and LLM Feedly Summary: Qwen just released their sixth model(!) for this July called Qwen3-Coder-30B-A3B-Instruct – listed as Qwen3-Coder-Flash in their chat.qwen.ai interface. It’s 30.5B total parameters with 3.3B active at any one time. This means…
-
Simon Willison’s Weblog: The best available open weight LLMs now come from China
Source URL: https://simonwillison.net/2025/Jul/30/chinese-models/ Source: Simon Willison’s Weblog Title: The best available open weight LLMs now come from China Feedly Summary: Something that has become undeniable this month is that the best available open weight models now come from the Chinese AI labs. I continue to have a lot of love for Mistral, Gemma and Llama…
-
Simon Willison’s Weblog: Qwen/Qwen3-30B-A3B-Instruct-2507
Source URL: https://simonwillison.net/2025/Jul/29/qwen3-30b-a3b-instruct-2507/ Source: Simon Willison’s Weblog Title: Qwen/Qwen3-30B-A3B-Instruct-2507 Feedly Summary: Qwen/Qwen3-30B-A3B-Instruct-2507 New model update from Qwen, improving on their previous Qwen3-30B-A3B release from late April. In their tweet they said: Smarter, faster, and local deployment-friendly. ✨ Key Enhancements: ✅ Enhanced reasoning, coding, and math skills ✅ Broader multilingual knowledge ✅ Improved long-context understanding (up…
-
Simon Willison’s Weblog: My 2.5 year old laptop can write Space Invaders in JavaScript now
Source URL: https://simonwillison.net/2025/Jul/29/space-invaders/ Source: Simon Willison’s Weblog Title: My 2.5 year old laptop can write Space Invaders in JavaScript now Feedly Summary: I wrote about the new GLM-4.5 model family yesterday – new open weight (MIT licensed) models from Z.ai in China which their benchmarks claim score highly in coding even against models such as…
-
Simon Willison’s Weblog: GLM-4.5: Reasoning, Coding, and Agentic Abilities
Source URL: https://simonwillison.net/2025/Jul/28/glm-45/#atom-everything Source: Simon Willison’s Weblog Title: GLM-4.5: Reasoning, Coding, and Agentic Abilities Feedly Summary: GLM-4.5: Reasoning, Coding, and Agentic Abilities Another day, another significant new open weight model release from a Chinese frontier AI lab. This time it’s Z.ai – who rebranded (at least in English) from Zhipu AI a few months ago.…
-
Cloud Blog: BigQuery meets ADK & MCP: Accelerate agent development with BigQuery’s new first-party toolset
Source URL: https://cloud.google.com/blog/products/ai-machine-learning/bigquery-meets-google-adk-and-mcp/ Source: Cloud Blog Title: BigQuery meets ADK & MCP: Accelerate agent development with BigQuery’s new first-party toolset Feedly Summary: As the excitement around AI agents reaches enterprise customers, a critical question emerges: How can we empower these agents to securely and intelligently interact with enterprise data systems like Google Cloud BigQuery? Currently,…
-
Cloud Blog: Understanding Calendar mode for Dynamic Workload Scheduler: Reserve ML GPUs and TPUs
Source URL: https://cloud.google.com/blog/products/compute/dynamic-workload-scheduler-calendar-mode-reserves-gpus-and-tpus/ Source: Cloud Blog Title: Understanding Calendar mode for Dynamic Workload Scheduler: Reserve ML GPUs and TPUs Feedly Summary: Organizations need ML compute resources that can accommodate bursty peaks and periodic troughs. That means the consumption models for AI infrastructure need to evolve to be more cost-efficient, provide term flexibility, and support rapid…