Source URL: https://simonwillison.net/2025/Sep/23/qwen3-vl/
Source: Simon Willison’s Weblog
Title: Qwen3-VL: Sharper Vision, Deeper Thought, Broader Action
Feedly Summary:
I’ve been looking forward to this. Qwen 2.5 VL is one of the best available open weight vision LLMs, so I had high hopes for Qwen 3’s vision models.
“Firstly, we are open-sourcing the flagship model of this series: Qwen3-VL-235B-A22B, available in both Instruct and Thinking versions. The Instruct version matches or even exceeds Gemini 2.5 Pro in major visual perception benchmarks. The Thinking version achieves state-of-the-art results across many multimodal reasoning benchmarks.”
Bold claims against Gemini 2.5 Pro, which are supported by a flurry of self-reported benchmarks.
This initial model is enormous. On Hugging Face both Qwen3-VL-235B-A22B-Instruct and Qwen3-VL-235B-A22B-Thinking are 235B parameters and weigh 471 GB. Not something I’m going to be able to run on my 64GB Mac!
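A rough sketch of the arithmetic behind that size. The bytes-per-parameter figures are assumptions for illustration, not something the post states: 16-bit weights take 2 bytes per parameter, and the 8-bit and 4-bit rows are hypothetical quantized variants. The estimate counts weights only and ignores any extra memory needed at inference time.

```python
# Back-of-the-envelope weight size estimate.
# ASSUMPTION: bytes per parameter for each precision; the post does not
# say which precision the published weights use.
PARAMS = 235e9       # total parameter count, from the model name (235B)
MAC_RAM_GB = 64      # the 64GB Mac mentioned in the post

BYTES_PER_PARAM = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}


def weights_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


for label, size in BYTES_PER_PARAM.items():
    gb = weights_gb(PARAMS, size)
    verdict = "fits" if gb <= MAC_RAM_GB else "does not fit"
    print(f"{label}: ~{gb:.0f} GB of weights, {verdict} in {MAC_RAM_GB} GB")
```

Under the 16-bit assumption this gives about 470 GB, which lines up with the 471 GB listed on Hugging Face. Even the hypothetical 4-bit row stays well above 64 GB.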
The Qwen 2.5 VL family included models at 72B, 32B, 7B and 3B sizes. Given the rate Qwen are shipping models at the moment I wouldn’t be surprised to see smaller Qwen 3 VL models show up in just the next few days.
Also from Qwen today, three new API-only closed-weight models: upgraded Qwen 3 Coder, Qwen3-LiveTranslate-Flash (real-time multimodal interpretation), and Qwen3-Max, their new trillion-parameter flagship model, which they describe as their “largest and most capable model to date”.
Via Hacker News
Tags: ai, generative-ai, llms, vision-llms, qwen, llm-reasoning, llm-release, ai-in-china
AI Summary and Description: Yes
Summary: The text discusses the release of Qwen3-VL, an open-weight vision LLM from Qwen, highlighting its self-reported benchmark results against Gemini 2.5 Pro and its very large size (235B parameters, 471 GB). This is relevant to professionals in AI and infrastructure security because models of this scale shape deployment planning and the security considerations that come with it.
Detailed Description:
The content focuses on the introduction of Qwen3-VL, a powerful vision Large Language Model (LLM) designed for enhanced visual perception and multimodal reasoning. Key highlights include:
– **Model Variants**: The flagship Qwen3-VL-235B-A22B comes in both Instruct and Thinking versions. Qwen claims the Instruct version matches or exceeds Gemini 2.5 Pro on major visual perception benchmarks and that the Thinking version achieves state-of-the-art results on many multimodal reasoning benchmarks.
– **Benchmark Performance**: These claims rest on self-reported benchmarks from Qwen; the post does not cite independent evaluations.
– **Model Size**: The flagship Qwen3-VL-235B-A22B model has 235 billion parameters and takes up 471 GB on Hugging Face, far more than the author's 64GB Mac can handle.
– **Future Developments**: The Qwen 2.5 VL family shipped at 72B, 32B, 7B and 3B sizes, and the author expects smaller Qwen 3 VL models to follow soon given Qwen's current release pace.
– **Additional Releases**: Qwen also announced three new API-only closed-weight models: an upgraded Qwen 3 Coder, Qwen3-LiveTranslate-Flash (real-time multimodal interpretation), and Qwen3-Max, a trillion-parameter flagship.
Considering these advancements, several implications arise for security and compliance professionals:
– **Security in AI Models**: The deployment of large models requires rigorous compliance with security protocols, especially as they may be used in sensitive applications that involve personal or proprietary data.
– **Infrastructure Readiness**: Organizations must assess their infrastructure capabilities to handle such expansive models, as inadequate resources could lead to security vulnerabilities during deployment.
– **Governance and Compliance**: The advancements in models like Qwen3-VL increase the need for clear governance frameworks to ensure ethical use and compliance with regulatory standards in AI applications.
– **Industry Competitiveness**: The announcement signifies ongoing competition in the AI field, requiring professionals to stay abreast of technological developments to maintain organizational competitiveness.
The evolution of generative AI models like Qwen3-VL emphasizes the critical need for integrated security strategies as organizations leverage these technologies for both operational efficiency and regulatory compliance.