Source URL: https://simonwillison.net/2025/Mar/24/deepseek/
Source: Simon Willison’s Weblog
Title: deepseek-ai/DeepSeek-V3-0324
Chinese AI lab DeepSeek just released the latest version of their enormous DeepSeek v3 model, baking the release date into the name DeepSeek-V3-0324.
The license is MIT, the README is empty and the release adds up to a total of 641 GB of files, mostly of the form model-00035-of-000163.safetensors.
The model only came out a few hours ago and MLX developer Awni Hannun already has it running at >20 tokens/second on a 512GB M3 Ultra Mac Studio ($9,499 of ostensibly consumer-grade hardware) via mlx-lm and this mlx-community/DeepSeek-V3-0324-4bit 4bit quantization, which reduces the on-disk size to 352 GB.
I think that means if you have that machine you can run it with my llm-mlx plugin like this, but I’ve not tried myself!
llm mlx download-model mlx-community/DeepSeek-V3-0324-4bit
llm chat -m mlx-community/DeepSeek-V3-0324-4bit
In putting this post together I got Claude to build me this new tool for finding the total on-disk size of a Hugging Face repository, which is available in their API but not currently displayed on their website.
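The post links to the tool rather than reproducing it, but the same idea can be sketched in a few lines, assuming the Hugging Face Hub tree API endpoint (`/api/models/{repo_id}/tree/main?recursive=true`) and its `type`/`size` fields; the function names here are illustrative, not the actual tool's:

```python
import json
import urllib.request


def repo_file_entries(repo_id: str) -> list[dict]:
    """Fetch the recursive file listing for a model repo from the
    Hugging Face tree API (assumed endpoint shape)."""
    url = f"https://huggingface.co/api/models/{repo_id}/tree/main?recursive=true"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def total_size_bytes(entries: list[dict]) -> int:
    """Sum the sizes of file entries; directory entries carry no size."""
    return sum(e.get("size", 0) for e in entries if e.get("type") == "file")


def human_size(n: int) -> str:
    """Format a byte count in GB, matching how the post quotes sizes."""
    return f"{n / 1e9:.0f} GB"


# Usage (requires network access):
# entries = repo_file_entries("mlx-community/DeepSeek-V3-0324-4bit")
# print(human_size(total_size_bytes(entries)))
```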
The new model is also listed on OpenRouter but I haven’t successfully run a prompt through it there yet – I currently get an error saying “No endpoints found matching your data policy”.
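For anyone who wants to try the same thing once an endpoint is available: OpenRouter exposes an OpenAI-compatible chat completions API, so a request can be assembled like this. The model slug below is an assumption for illustration; check OpenRouter's listing for the actual identifier.

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for OpenRouter."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Usage (requires an OpenRouter API key; model slug is assumed):
# req = build_request("deepseek/deepseek-chat-v3-0324", "Hello", api_key)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```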
Tags: llm-release, hugging-face, generative-ai, deepseek, ai, llms, mlx, llm, ai-assisted-programming, tools
Summary: The release of DeepSeek v3, a large AI model from a Chinese lab, demonstrates the rapid advancements in generative AI. The model’s size, quantization capabilities, and the ability to run on consumer-grade hardware represent significant developments in the field. This release, along with insights into model management and API utilization, highlights practical applications and challenges that security and compliance professionals should consider when deploying AI tools.
Detailed Description:
– **Release Details:**
– The latest version of the DeepSeek AI model, known as DeepSeek-V3-0324, was released recently by a Chinese AI lab.
– This model is substantial, comprising 641 GB of files, predominantly formatted as model-00035-of-000163.safetensors.
– It’s released under the MIT license, which allows for a wide range of usage and distribution.
– **Performance and Hardware:**
– Developer Awni Hannun successfully ran the model at over 20 tokens per second on a Mac Studio with an M3 Ultra chip and 512 GB of unified memory.
– A notable feature is the 4-bit quantization, which reduces the model’s storage requirement to 352 GB, making it more accessible for users with high-performance consumer-grade hardware.
– **Implementation:**
– The text indicates how to download and run the model using the llm-mlx plugin, highlighting practical usage instructions.
– The author used Claude to build a tool that reports the total on-disk size of a Hugging Face repository – information available via the API but not currently displayed on the website – showcasing AI-assisted tool building for model management.
– **Challenges Faced:**
– The model is also listed on OpenRouter, but the author could not yet run prompts there, receiving a “No endpoints found matching your data policy” error – a reminder that compliance and operational readiness can lag behind the release of new AI models.
– **Relevance to Security and Compliance:**
– As AI models grow in size and complexity, security professionals must be aware of data policies and compliance standards that regulate the use of such models.
– The open-source nature and the rapid development cycle of models like DeepSeek v3 raise questions about maintaining security, privacy, and regulatory compliance.
– **Implications for Professionals:**
– Understanding the capabilities and constraints of new AI models is essential for implementation in secure environments.
– Security teams need to establish governance frameworks that cover the use of AI models, especially in relation to data manipulation, model access, and operational deployments.
This release underlines the ongoing shift toward leveraging generative AI in varied applications and the importance of considering security and compliance in the AI lifecycle.