Tag: Qwen
-
The Register: DeepSeek’s not the only Chinese LLM maker OpenAI and pals have to worry about. Right, Alibaba?
Source URL: https://www.theregister.com/2025/01/30/alibaba_qwen_ai/
Source: The Register
Title: DeepSeek’s not the only Chinese LLM maker OpenAI and pals have to worry about. Right, Alibaba?
Feedly Summary: Qwen 2.5 Max tops both DS V3 and GPT-4o, cloud giant claims. Analysis: The speed and efficiency at which DeepSeek claims to be training large language models (LLMs) competitive with…
-
Slashdot: After DeepSeek Shock, Alibaba Unveils Rival AI Model That Uses Less Computing Power
Source URL: https://slashdot.org/story/25/01/29/184223/after-deepseek-shock-alibaba-unveils-rival-ai-model-that-uses-less-computing-power?utm_source=rss1.0mainlinkanon&utm_medium=feed
Source: Slashdot
Title: After DeepSeek Shock, Alibaba Unveils Rival AI Model That Uses Less Computing Power
Feedly Summary: AI Summary and Description: Yes
Summary: Alibaba’s unveiling of the Qwen2.5-Max AI model highlights advancements in AI performance achieved through a more efficient architecture. This development is particularly relevant to AI security and infrastructure…
-
Hacker News: Qwen2.5-Max: Exploring the Intelligence of Large-Scale MoE Model
Source URL: https://qwenlm.github.io/blog/qwen2.5-max/
Source: Hacker News
Title: Qwen2.5-Max: Exploring the Intelligence of Large-Scale MoE Model
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The text discusses the development and performance evaluation of Qwen2.5-Max, a large-scale Mixture-of-Experts (MoE) model pretrained on over 20 trillion tokens. It highlights significant advancements in model intelligence achieved through scaling…
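Qwen2.5-Max is offered through Alibaba Cloud’s API, so the quickest way to try it is an OpenAI-compatible client. A minimal sketch; the base URL, model snapshot id, and environment variable name are illustrative assumptions, not taken from the summary above:

    # Call Qwen2.5-Max through an OpenAI-compatible endpoint (details assumed).
    import os
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["DASHSCOPE_API_KEY"],  # assumed env var for the API key
        base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # assumed base URL
    )

    completion = client.chat.completions.create(
        model="qwen-max-2025-01-25",  # assumed snapshot id for Qwen2.5-Max
        messages=[
            {"role": "user", "content": "In two sentences, what trade-offs do MoE models make?"},
        ],
    )
    print(completion.choices[0].message.content)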
-
Simon Willison’s Weblog: Qwen2.5 VL! Qwen2.5 VL! Qwen2.5 VL!
Source URL: https://simonwillison.net/2025/Jan/27/qwen25-vl-qwen25-vl-qwen25-vl/
Source: Simon Willison’s Weblog
Title: Qwen2.5 VL! Qwen2.5 VL! Qwen2.5 VL!
Feedly Summary: Qwen2.5 VL! Qwen2.5 VL! Qwen2.5 VL! Hot on the heels of yesterday’s Qwen2.5-1M, here’s Qwen2.5 VL (with an excitable announcement title) – the latest in Qwen’s series of vision LLMs. They’re releasing multiple versions: base models and instruction tuned…
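Since Qwen2.5 VL is a vision LLM, the natural smoke test is sending it an image alongside a text prompt. A minimal sketch against a self-hosted OpenAI-compatible endpoint (e.g. one started with vLLM); the base URL and model id are assumptions for illustration, not details from the post:

    # Send an image plus a question to a served Qwen2.5 VL model (endpoint assumed).
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

    response = client.chat.completions.create(
        model="Qwen/Qwen2.5-VL-7B-Instruct",  # hypothetical local model id
        messages=[{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
                {"type": "text", "text": "Describe what this chart shows."},
            ],
        }],
    )
    print(response.choices[0].message.content)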
-
Hacker News: Qwen2.5-1M: Deploy Your Own Qwen with Context Length Up to 1M Tokens
Source URL: https://qwenlm.github.io/blog/qwen2.5-1m/
Source: Hacker News
Title: Qwen2.5-1M: Deploy Your Own Qwen with Context Length Up to 1M Tokens
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The text reports on the new release of the open-source Qwen2.5-1M models, capable of processing up to one million tokens, significantly improving inference speed and model performance…
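For the “deploy your own” angle (also covered in the two entries that follow), a minimal sketch of loading a 1M-context instruct checkpoint with Hugging Face transformers and handing it a long document. The repo id and input file are assumptions, and genuinely filling a million-token window needs far more GPU memory and a dedicated serving stack than this illustrates:

    # Load an assumed 1M-context Qwen checkpoint and summarise a long document.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Qwen/Qwen2.5-7B-Instruct-1M"  # assumed Hugging Face repo id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

    with open("long_report.txt") as f:  # hypothetical long input document
        document = f.read()

    messages = [{"role": "user", "content": f"Summarize the key findings:\n\n{document}"}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output_ids = model.generate(input_ids, max_new_tokens=512)
    print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))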
-
Hacker News: Qwen2.5-7B-Instruct-1M and Qwen2.5-14B-Instruct-1M
Source URL: https://simonwillison.net/2025/Jan/26/qwen25-1m/
Source: Hacker News
Title: Qwen2.5-7B-Instruct-1M and Qwen2.5-14B-Instruct-1M
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The Qwen 2.5 model release from Alibaba introduces a significant advancement in Large Language Model (LLM) capabilities with its ability to process up to 1 million tokens. This increase in input capacity is made possible through…
-
Simon Willison’s Weblog: Qwen2.5-1M: Deploy Your Own Qwen with Context Length up to 1M Tokens
Source URL: https://simonwillison.net/2025/Jan/26/qwen25-1m/
Source: Simon Willison’s Weblog
Title: Qwen2.5-1M: Deploy Your Own Qwen with Context Length up to 1M Tokens
Feedly Summary: Qwen2.5-1M: Deploy Your Own Qwen with Context Length up to 1M Tokens. Very significant new release from Alibaba’s Qwen team. Their openly licensed (sometimes Apache 2, sometimes Qwen license, I’ve had trouble keeping…
-
Hacker News: Official DeepSeek R1 Now on Ollama
Source URL: https://ollama.com/library/deepseek-r1
Source: Hacker News
Title: Official DeepSeek R1 Now on Ollama
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The text provides an overview of DeepSeek’s first-generation reasoning models that exhibit performance comparable to OpenAI’s offerings across math, code, and reasoning tasks. This information is highly relevant for practitioners in AI and…
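Since the point of this entry is that the official R1 tags are now on Ollama, here is a minimal sketch of chatting with one from Python, assuming the ollama client package is installed, the Ollama daemon is running, and the model has already been pulled (for example with ollama pull deepseek-r1; larger and distilled tags also exist on the library page):

    # Chat with a locally pulled DeepSeek R1 tag via the ollama Python client.
    import ollama

    response = ollama.chat(
        model="deepseek-r1",
        messages=[{"role": "user", "content": "What is 17 * 24? Show your reasoning."}],
    )
    print(response["message"]["content"])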