Tag: Qwen
-
Slashdot: Alibaba’s ZeroSearch Teaches AI To Search Without Search Engines, Cuts Training Costs By 88%
Source URL: https://slashdot.org/story/25/05/09/0113217/alibabas-zerosearch-teaches-ai-to-search-without-search-engines-cuts-training-costs-by-88?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: Alibaba’s ZeroSearch Teaches AI To Search Without Search Engines, Cuts Training Costs By 88% Feedly Summary: AI Summary and Description: Yes Summary: Alibaba Group’s “ZeroSearch” technique showcases an innovative approach that enables large language models (LLMs) to develop search capabilities without relying on external search engines, demonstrating significant cost…
-
Simon Willison’s Weblog: Saying "hi" to Microsoft’s Phi-4-reasoning
Source URL: https://simonwillison.net/2025/May/6/phi-4-reasoning/#atom-everything Source: Simon Willison’s Weblog Title: Saying "hi" to Microsoft’s Phi-4-reasoning Feedly Summary: Microsoft released a new sub-family of models a few days ago: Phi-4 reasoning. They introduced them in this blog post celebrating a year since the release of Phi-3: Today, we are excited to introduce Phi-4-reasoning, Phi-4-reasoning-plus, and Phi-4-mini-reasoning – marking…
-
Simon Willison’s Weblog: What people get wrong about the leading Chinese open models: Adoption and censorship
Source URL: https://simonwillison.net/2025/May/6/what-people-get-wrong-about-the-leading-chinese-models/#atom-everything Source: Simon Willison’s Weblog Title: What people get wrong about the leading Chinese open models: Adoption and censorship Feedly Summary: What people get wrong about the leading Chinese open models: Adoption and censorship While I’ve been enjoying trying out Alibaba’s Qwen 3 a lot recently, Nathan Lambert focuses on the elephant in…
-
Simon Willison’s Weblog: Qwen3-8B
Source URL: https://simonwillison.net/2025/May/2/qwen3-8b/#atom-everything Source: Simon Willison’s Weblog Title: Qwen3-8B Feedly Summary: Having tried a few of the Qwen 3 models now my favorite is a bit of a surprise to me: I’m really enjoying Qwen3-8B. I’ve been running prompts through the MLX 4bit quantized version, mlx-community/Qwen3-8B-4bit. I’m using llm-mlx like this: llm install llm-mlx llm…
-
Simon Willison’s Weblog: Qwen 3 offers a case study in how to effectively release a model
Source URL: https://simonwillison.net/2025/Apr/29/qwen-3/ Source: Simon Willison’s Weblog Title: Qwen 3 offers a case study in how to effectively release a model Feedly Summary: Alibaba’s Qwen team released the hotly anticipated Qwen 3 model family today. The Qwen models are already some of the best open weight models – Apache 2.0 licensed and with a variety…
-
Simon Willison’s Weblog: Qwen2.5 Omni: See, Hear, Talk, Write, Do It All!
Source URL: https://simonwillison.net/2025/Apr/28/qwen25-omni/#atom-everything Source: Simon Willison’s Weblog Title: Qwen2.5 Omni: See, Hear, Talk, Write, Do It All! Feedly Summary: Qwen2.5 Omni: See, Hear, Talk, Write, Do It All! I’m not sure how I missed this one at the time, but last month (March 27th) Qwen released their first multi-modal model that can handle audio and…
-
Simon Willison’s Weblog: Note on 20th April 2025
Source URL: https://simonwillison.net/2025/Apr/20/janky-license/#atom-everything Source: Simon Willison’s Weblog Title: Note on 20th April 2025 Feedly Summary: Now that Llama has very real competition in open weight models (Gemma 3, latest Mistrals, DeepSeek, Qwen) I think their janky license is becoming much more of a liability for them. It’s just limiting enough that it could be the…
-
Simon Willison’s Weblog: An LLM Query Understanding Service
Source URL: https://simonwillison.net/2025/Apr/9/an-llm-query-understanding-service/#atom-everything Source: Simon Willison’s Weblog Title: An LLM Query Understanding Service Feedly Summary: An LLM Query Understanding Service Doug Turnbull recently wrote about how all search is structured now: Many times, even a small open source LLM will be able to turn a search query into reasonable structure at relatively low cost. In…
-
Simon Willison’s Weblog: Mistral Small 3.1 on Ollama
Source URL: https://simonwillison.net/2025/Apr/8/mistral-small-31-on-ollama/#atom-everything Source: Simon Willison’s Weblog Title: Mistral Small 3.1 on Ollama Feedly Summary: Mistral Small 3.1 on Ollama Mistral Small 3.1 (previously) is now available through Ollama, providing an easy way to run this multi-modal (vision) model on a Mac (and other platforms, though I haven’t tried them myself yet). I had to…
-
Simon Willison’s Weblog: Long context support in LLM 0.24 using fragments and template plugins
Source URL: https://simonwillison.net/2025/Apr/7/long-context-llm/#atom-everything Source: Simon Willison’s Weblog Title: Long context support in LLM 0.24 using fragments and template plugins Feedly Summary: LLM 0.24 is now available with new features to help take advantage of the increasingly long input context supported by modern LLMs. (LLM is my command-line tool and Python library for interacting with LLMs,…