Tag: Qwen
-
Simon Willison’s Weblog: Qwen3-Coder: Agentic Coding in the World
Source URL: https://simonwillison.net/2025/Jul/22/qwen3-coder/
Source: Simon Willison’s Weblog
Title: Qwen3-Coder: Agentic Coding in the World
Feedly Summary: Qwen3-Coder: Agentic Coding in the World It turns out that as I was typing up my notes on Qwen3-235B-A22B-Instruct-2507 the Qwen team were unleashing something much bigger: Today, we’re announcing Qwen3-Coder, our most agentic code model to date. Qwen3-Coder…
-
Simon Willison’s Weblog: Qwen/Qwen3-235B-A22B-Instruct-2507
Source URL: https://simonwillison.net/2025/Jul/22/qwen3-235b-a22b-instruct-2507/#atom-everything
Source: Simon Willison’s Weblog
Title: Qwen/Qwen3-235B-A22B-Instruct-2507
Feedly Summary: Qwen/Qwen3-235B-A22B-Instruct-2507 Significant new model release from Qwen, published yesterday without much fanfare. This is a follow-up to their April release of the full Qwen 3 model family, which included a Qwen3-235B-A22B model which could handle both reasoning and non-reasoning prompts (via a /no_think toggle).…
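For context on that toggle: the April hybrid checkpoint let you switch reasoning off either through the chat template or by appending /no_think to a prompt, both documented Qwen3 conventions. A minimal sketch, assuming the Hugging Face transformers chat template for the April checkpoint (exact behaviour may vary by library version):

```python
# Sketch: toggling Qwen3's hybrid reasoning mode via the chat template.
# Assumes transformers >= 4.51 and the April hybrid checkpoint; the new
# Instruct-2507 release is non-reasoning only, so no toggle is needed there.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-235B-A22B")

messages = [{"role": "user", "content": "What is 17 * 23?"}]

# Hard switch: disable the <think> block through the template argument.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True,
    enable_thinking=False,
)

# Soft switch: the same effect by appending /no_think to the user turn.
messages_soft = [{"role": "user", "content": "What is 17 * 23? /no_think"}]
prompt_soft = tokenizer.apply_chat_template(
    messages_soft, tokenize=False, add_generation_prompt=True,
)
print(prompt[:200])
```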
-
Cloud Blog: Implementing High-Performance LLM Serving on GKE: An Inference Gateway Walkthrough
Source URL: https://cloud.google.com/blog/topics/developers-practitioners/implementing-high-performance-llm-serving-on-gke-an-inference-gateway-walkthrough/
Source: Cloud Blog
Title: Implementing High-Performance LLM Serving on GKE: An Inference Gateway Walkthrough
Feedly Summary: The excitement around open Large Language Models like Gemma, Llama, Mistral, and Qwen is evident, but developers quickly hit a wall. How do you deploy them effectively at scale? Traditional load balancing algorithms fall short, as…
-
Slashdot: Simple Text Additions Can Fool Advanced AI Reasoning Models, Researchers Find
Source URL: https://tech.slashdot.org/story/25/07/04/1521245/simple-text-additions-can-fool-advanced-ai-reasoning-models-researchers-find
Source: Slashdot
Title: Simple Text Additions Can Fool Advanced AI Reasoning Models, Researchers Find
Feedly Summary: The research highlights a significant vulnerability in state-of-the-art reasoning AI models through the “CatAttack” technique, which attaches irrelevant phrases to math problems, leading to higher error rates and inefficient responses.…
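The attack is simple to reproduce in spirit: append a semantically irrelevant sentence to an otherwise unchanged math problem and compare the answers. A minimal sketch, assuming an OpenAI-compatible API; the distractor text and model name here are illustrative, not necessarily the paper’s exact triggers or targets:

```python
# Sketch of a CatAttack-style probe: the same math problem with and
# without an irrelevant suffix, answered by the model under test.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY in the environment

PROBLEM = "If 3x + 7 = 22, what is x?"
DISTRACTOR = " Interesting fact: cats sleep for most of their lives."

def answer(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical choice; use the model under test
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

baseline = answer(PROBLEM)
attacked = answer(PROBLEM + DISTRACTOR)
print("baseline:", baseline)
print("with distractor:", attacked)  # compare correctness and response length
```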
-
Docker: Tool Calling with Local LLMs: A Practical Evaluation
Source URL: https://www.docker.com/blog/local-llm-tool-calling-a-practical-evaluation/
Source: Docker
Title: Tool Calling with Local LLMs: A Practical Evaluation
Feedly Summary: Which local model should I use for tool calling? When building GenAI and agentic applications, one of the most pressing and persistent questions is: “Which local model should I use for tool calling?” We kept hearing again and again,…
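For context on what such an evaluation asks of a model: the client sends a JSON schema describing each tool, and the model must decide when to emit a structured call instead of free text. A minimal sketch against a local OpenAI-compatible endpoint; the base URL, model tag, and tool are assumptions, not taken from the post:

```python
# Sketch: tool calling against a local OpenAI-compatible endpoint.
# The base_url/model below are assumptions -- substitute whatever your
# local runtime (Docker Model Runner, Ollama, llama.cpp server) exposes.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:12434/v1", api_key="unused")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="qwen3",  # assumed local model tag
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
)

# A model that handles tool calling well returns a structured call here
# rather than describing the weather from its training data.
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```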
-
Simon Willison’s Weblog: AbsenceBench: Language Models Can’t Tell What’s Missing
Source URL: https://simonwillison.net/2025/Jun/20/absencebench/#atom-everything
Source: Simon Willison’s Weblog
Title: AbsenceBench: Language Models Can’t Tell What’s Missing
Feedly Summary: AbsenceBench: Language Models Can’t Tell What’s Missing Here’s another interesting result to file under the “jagged frontier” of LLMs, where their strengths and weaknesses are often unintuitive. Long context models have been getting increasingly good at passing “Needle…
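The benchmark inverts the needle-in-a-haystack setup: instead of asking what was inserted into a long document, it deletes parts of a document the model can see and asks what is gone. A minimal sketch of that construction (my own illustration, not the benchmark’s code):

```python
# Sketch of an AbsenceBench-style probe: show the model an original
# document and a copy with random lines deleted, then ask what is missing.
import random

def make_absence_probe(document: str, n_omit: int = 3, seed: int = 0):
    lines = document.splitlines()  # assumes len(lines) >= n_omit
    rng = random.Random(seed)
    omitted_idx = set(rng.sample(range(len(lines)), n_omit))
    redacted = [l for i, l in enumerate(lines) if i not in omitted_idx]
    prompt = (
        "Here is an original document:\n" + document +
        "\n\nHere is a copy with some lines removed:\n" + "\n".join(redacted) +
        "\n\nList exactly the lines that are missing from the copy."
    )
    ground_truth = [lines[i] for i in sorted(omitted_idx)]
    return prompt, ground_truth

# The returned ground_truth makes scoring trivial: exact-match the
# model's listed lines against it.
```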
-
Simon Willison’s Weblog: Qwen3 Embedding
Source URL: https://simonwillison.net/2025/Jun/8/qwen3-embedding/#atom-everything
Source: Simon Willison’s Weblog
Title: Qwen3 Embedding
Feedly Summary: Qwen3 Embedding New family of embedding models from Qwen, in three sizes: 0.6B, 4B, 8B – and two categories: Text Embedding and Text Reranking. The full collection can be browsed on Hugging Face. The smallest available model is the 0.6B Q8 one, which…
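A minimal sketch of trying the smallest model in the family via sentence-transformers (the model ID matches the Hugging Face collection; the query prompt follows the model card’s convention, but treat the details as assumptions):

```python
# Sketch: text similarity with the 0.6B Qwen3 embedding model via
# sentence-transformers (v3+). Downloads the weights on first run.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B")

docs = [
    "Qwen3-Coder is an agentic code model.",
    "The weather in Tokyo is mild in spring.",
]
# The model card uses a dedicated prompt for queries, hence prompt_name.
query_emb = model.encode(["open-weight coding models"], prompt_name="query")
doc_emb = model.encode(docs)

# Cosine similarities: one row per query, one column per document.
print(model.similarity(query_emb, doc_emb))
```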
-
Cloud Blog: Building a Production Multimodal Fine-Tuning Pipeline
Source URL: https://cloud.google.com/blog/topics/developers-practitioners/building-a-production-multimodal-fine-tuning-pipeline/
Source: Cloud Blog
Title: Building a Production Multimodal Fine-Tuning Pipeline
Feedly Summary: Looking to fine-tune multimodal AI models for your specific domain but facing infrastructure and implementation challenges? This guide demonstrates how to overcome the multimodal implementation gap using Google Cloud and Axolotl, with a complete hands-on example fine-tuning Gemma 3 on…
-
Simon Willison’s Weblog: Shisa V2 405B: Japan’s Highest Performing LLM
Source URL: https://simonwillison.net/2025/Jun/3/shisa-v2/
Source: Simon Willison’s Weblog
Title: Shisa V2 405B: Japan’s Highest Performing LLM
Feedly Summary: Shisa V2 405B: Japan’s Highest Performing LLM Leonard Lin and Adam Lensenmayer have been working on Shisa for a while. They describe their latest release as “Japan’s Highest Performing LLM”. Shisa V2 405B is the highest-performing LLM ever…
-
Simon Willison’s Weblog: deepseek-ai/DeepSeek-R1-0528
Source URL: https://simonwillison.net/2025/May/31/deepseek-aideepseek-r1-0528/
Source: Simon Willison’s Weblog
Title: deepseek-ai/DeepSeek-R1-0528
Feedly Summary: deepseek-ai/DeepSeek-R1-0528 Sadly the trend for terrible naming of models has infested the Chinese AI labs as well. DeepSeek-R1-0528 is a brand new and much improved open weights reasoning model from DeepSeek, a major step up from the DeepSeek R1 they released back in January.…