Source URL: https://simonwillison.net/2025/Mar/24/deepseek/
Source: Simon Willison’s Weblog
Title: deepseek-ai/DeepSeek-V3-0324
Chinese AI lab DeepSeek just released the latest version of their enormous DeepSeek v3 model, baking the release date into the name DeepSeek-V3-0324.
The license is MIT, the README is empty and the release adds up to a total of 641 GB of files, mostly of the form model-00035-of-000163.safetensors.
The model only came out a few hours ago and MLX developer Awni Hannun already has it running at >20 tokens/second on a 512GB M3 Ultra Mac Studio ($9,499 of ostensibly consumer-grade hardware) via mlx-lm and this mlx-community/DeepSeek-V3-0324-4bit 4bit quantization, which reduces the on-disk size to 352 GB.
I think that means if you have that machine you can run it with my llm-mlx plugin like this, but I’ve not tried myself!
llm mlx download-model mlx-community/DeepSeek-V3-0324-4bit
llm chat -m mlx-community/DeepSeek-V3-0324-4bit
In putting this post together I got Claude to build me this new tool for finding the total on-disk size of a Hugging Face repository, which is available in their API but not currently displayed on their website.
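The post links to the tool rather than reproducing it, but the same idea can be sketched in a few lines, assuming the Hugging Face Hub tree API endpoint (`/api/models/{repo_id}/tree/main?recursive=true`) and its `type`/`size` fields; the function names here are illustrative, not the actual tool's:

```python
import json
import urllib.request


def repo_file_entries(repo_id: str) -> list[dict]:
    """Fetch the recursive file listing for a model repo from the
    Hugging Face tree API (assumed endpoint shape)."""
    url = f"https://huggingface.co/api/models/{repo_id}/tree/main?recursive=true"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def total_size_bytes(entries: list[dict]) -> int:
    """Sum the sizes of file entries; directory entries carry no size."""
    return sum(e.get("size", 0) for e in entries if e.get("type") == "file")


def human_size(n: int) -> str:
    """Format a byte count in GB, matching how the post quotes sizes."""
    return f"{n / 1e9:.0f} GB"


# Usage (requires network access):
# entries = repo_file_entries("mlx-community/DeepSeek-V3-0324-4bit")
# print(human_size(total_size_bytes(entries)))
```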
The new model is also listed on OpenRouter but I haven’t successfully run a prompt through it there yet – I currently get an error saying “No endpoints found matching your data policy”.
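For anyone who wants to try the same thing once an endpoint is available: OpenRouter exposes an OpenAI-compatible chat completions API, so a request can be assembled like this. The model slug below is an assumption for illustration; check OpenRouter's listing for the actual identifier.

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for OpenRouter."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Usage (requires an OpenRouter API key; model slug is assumed):
# req = build_request("deepseek/deepseek-chat-v3-0324", "Hello", api_key)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```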
Tags: llm-release, hugging-face, generative-ai, deepseek, ai, llms, mlx, llm, ai-assisted-programming, tools
Summary: The release of DeepSeek v3, a large AI model from a Chinese lab, demonstrates the rapid advancements in generative AI. The model’s size, quantization capabilities, and the ability to run on consumer-grade hardware represent significant developments in the field. This release, along with insights into model management and API utilization, highlights practical applications and challenges that security and compliance professionals should consider when deploying AI tools.
Detailed Description:
– **Release Details:**
– The latest version of the DeepSeek AI model, known as DeepSeek-V3-0324, was released recently by a Chinese AI lab.
– This model is substantial, comprising 641 GB of files, predominantly formatted as model-00035-of-000163.safetensors.
– It’s released under the MIT license, which allows for a wide range of usage and distribution.
– **Performance and Hardware:**
– Developer Awni Hannun successfully ran the model at over 20 tokens per second on a Mac Studio with an M3 Ultra chip and 512 GB of unified memory.
– A notable feature is the 4-bit quantization, which reduces the model’s storage requirement to 352 GB, making it more accessible for users with high-performance consumer-grade hardware.
– **Implementation:**
– The text indicates how to download and run the model using the llm-mlx plugin, highlighting practical usage instructions.
– The author used Claude to build a tool that reports the total on-disk size of a Hugging Face repository – information available via the API but not currently displayed on the website – showcasing AI-assisted tool building for model management.
– **Challenges Faced:**
– The model is also listed on OpenRouter, but the author could not yet run prompts there, receiving a “No endpoints found matching your data policy” error – a reminder that compliance and operational readiness can lag behind the release of new AI models.
– **Relevance to Security and Compliance:**
– As AI models grow in size and complexity, security professionals must be aware of data policies and compliance standards that regulate the use of such models.
– The open-source nature and the rapid development cycle of models like DeepSeek v3 raise questions about maintaining security, privacy, and regulatory compliance.
– **Implications for Professionals:**
– Understanding the capabilities and constraints of new AI models is essential for implementation in secure environments.
– Security teams need to establish governance frameworks that cover the use of AI models, especially in relation to data manipulation, model access, and operational deployments.
This release underlines the ongoing shift toward leveraging generative AI in varied applications and the importance of considering security and compliance in the AI lifecycle.