Tag: DeepSeek

Source URL: https://simonwillison.net/2025/Oct/1/two-pelicans/#atom-everything Source: Simon Willison’s Weblog Title: Two more Chinese pelicans Feedly Summary: Two new models from Chinese AI labs in the past few days. I tried them both out using llm-openrouter: DeepSeek-V3.2-Exp from DeepSeek. Announcement, Tech Report, Hugging Face (690GB, MIT license). As an intermediate step toward our next-generation architecture, V3.2-Exp builds upon…

Simon Willison’s Weblog: CompileBench: Can AI Compile 22-year-old Code?

Sep 22, 2025

Source URL: https://simonwillison.net/2025/Sep/22/compilebench/ Source: Simon Willison’s Weblog Title: CompileBench: Can AI Compile 22-year-old Code? Feedly Summary: CompileBench: Can AI Compile 22-year-old Code? Interesting new LLM benchmark from Piotr Grabowski and Piotr Migdał: how well can different models handle compilation challenges such as cross-compiling gucr for ARM64 architecture? This is one of my favorite applications of…

The Register: Sorry, but DeepSeek didn’t really train its flagship model for $294,000

Sep 19, 2025

Source URL: https://www.theregister.com/2025/09/19/deepseek_cost_train/ Source: The Register Title: Sorry, but DeepSeek didn’t really train its flagship model for $294,000 Feedly Summary: Training costs detailed in R1 training report don’t include 2.79 million GPU hours that laid its foundation. Chinese AI darling DeepSeek’s now-infamous R1 research report was published in the journal Nature this week, alongside…

AWS News Blog: DeepSeek-V3.1 model now available in Amazon Bedrock

Source URL: https://aws.amazon.com/blogs/aws/deepseek-v3-1-now-available-in-amazon-bedrock/ Source: AWS News Blog Title: DeepSeek-V3.1 model now available in Amazon Bedrock Feedly Summary: AWS launches DeepSeek-V3.1 as a fully managed model in Amazon Bedrock. DeepSeek-V3.1 is a hybrid open-weight model that switches between thinking mode for detailed step-by-step analysis and non-thinking mode for faster responses. AI Summary and Description: Yes…

Slashdot: China’s DeepSeek Says Its Hit AI Model Cost Just $294,000 To Train

Source URL: https://slashdot.org/story/25/09/18/1315238/chinas-deepseek-says-its-hit-ai-model-cost-just-294000-to-train?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: China’s DeepSeek Says Its Hit AI Model Cost Just $294,000 To Train Feedly Summary: AI Summary and Description: Yes Summary: The text discusses the cost of training the R1 AI model by Chinese developer DeepSeek, which at $294,000 is significantly lower than costs cited by U.S. competitors. This data,…

The Register: China’s DeepSeek applying trial-and-error learning to its AI ‘reasoning’

Source URL: https://www.theregister.com/2025/09/18/chinas_deepseek_ai_reasoning_research/ Source: The Register Title: China’s DeepSeek applying trial-and-error learning to its AI ‘reasoning’ Feedly Summary: Model can also explain its answers, researchers find. Chinese AI company DeepSeek has shown it can improve the reasoning of its LLM DeepSeek-R1 through trial-and-error-based reinforcement learning, and that the model can even be made to explain its reasoning on…

Slashdot: DeepSeek Writes Less-Secure Code For Groups China Disfavors

Source URL: https://slashdot.org/story/25/09/17/2123211/deepseek-writes-less-secure-code-for-groups-china-disfavors?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: DeepSeek Writes Less-Secure Code For Groups China Disfavors Feedly Summary: AI Summary and Description: Yes Summary: The research by CrowdStrike reveals that DeepSeek, a leading AI firm in China, provides lower-quality and less secure code for requests linked to certain politically sensitive groups, highlighting the intersection of AI technology…

Slashdot: UAE Lab Releases Open-Source Model to Rival China’s DeepSeek

Sep 13, 2025

Source URL: https://slashdot.org/story/25/09/13/1734225/uae-lab-releases-open-source-model-to-rival-chinas-deepseek Source: Slashdot Title: UAE Lab Releases Open-Source Model to Rival China’s DeepSeek Feedly Summary: AI Summary and Description: Yes Summary: The United Arab Emirates is making significant advancements in the AI arena, exemplified by the release of the K2 Think model from the Institute of Foundation Models. This open-source model, which reportedly…

Cloud Blog: How Baseten achieves 225% better cost-performance for AI inference (and you can too)

Sep 4, 2025
