Simon Willison’s Weblog: What people get wrong about the leading Chinese open models: Adoption and censorship

Source URL: https://simonwillison.net/2025/May/6/what-people-get-wrong-about-the-leading-chinese-models/#atom-everything
Source: Simon Willison’s Weblog
Title: What people get wrong about the leading Chinese open models: Adoption and censorship

Feedly Summary: What people get wrong about the leading Chinese open models: Adoption and censorship
While I’ve been enjoying trying out Alibaba’s Qwen 3 a lot recently, Nathan Lambert focuses on the elephant in the room:

People vastly underestimate the number of companies that cannot use Qwen and DeepSeek open models because they come from China. This includes on-premise solutions built by people who know that model weights alone cannot reveal anything to their creators.

The root problem here is the closed nature of the training data. Even if a model is open weights, it’s not possible to conclusively determine that it couldn’t add backdoors to generated code or trigger “indirect influence of Chinese values on Western business systems”. Qwen 3 certainly has baked-in opinions about the status of Taiwan!
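As a quick illustration (not from the original post), one way to observe such baked-in positions is to prompt a locally served copy of the model. The sketch below assumes an Ollama server running locally with a pulled qwen3 model; both the model name and the prompt are illustrative.

```python
# Sketch: probing a locally served model for baked-in positions.
# Assumes an Ollama server at its default port with the "qwen3" model
# pulled; the model name and the prompt are illustrative choices.
import json
import urllib.request

payload = json.dumps({
    "model": "qwen3",
    "prompt": "What is the political status of Taiwan?",
    "stream": False,
}).encode()

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```

Sending the same prompt to a Western-trained model and comparing the answers makes the baked-in difference easy to see.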
Nathan sees this as an opportunity for other liberally licensed models, including his own team’s OLMo:

This gap provides a big opportunity for Western AI labs to lead in open models. Without DeepSeek and Qwen, the top tier of models we’re left with are Llama and Gemma, which both have very restrictive licenses when compared to their Chinese counterparts. These licenses are proportionally likely to block an IT department from approving a model.
This takes us to the middle tier of permissively licensed, open-weight models that actually have a huge opportunity ahead of them: OLMo (of course, I’m biased), Microsoft with Phi, Mistral, IBM (!??!), and some other smaller companies to fill out the long tail.

Via @natolambert
Tags: ai-ethics, generative-ai, ai, qwen, llms, open-source

AI Summary and Description: Yes

Summary: The text examines misconceptions about the adoption of the leading Chinese open-weight AI models, particularly concerns about censorship and the risks of using models whose training data is closed. It highlights a market gap that Western AI labs could fill by offering more permissively licensed models.

Detailed Description: The text presents a critical examination of the landscape of open AI models, particularly those originating from China, such as Alibaba’s Qwen and DeepSeek. The author emphasizes several key points:

– **Adoption Barriers**: Many companies simply cannot use Chinese models like Qwen and DeepSeek because they come from China; this applies even to on-premise deployments, where model weights alone cannot leak data back to their creators. These concerns limit the models’ adoption by global enterprises.

– **Concerns Over Censorship and Bias**: Even with open model weights, the closed training data raises concerns about embedded biases or backdoors that could manipulate outputs, reflecting “Chinese values” and influencing Western business practices.

– **Opportunities for Western AI Labs**: The hurdles facing Chinese models create a clear opening for Western developers of open models with fewer restrictions, including OLMo and models from companies such as Microsoft (Phi), Mistral, and IBM.

– **Competitive Landscape**: The text contrasts the restrictive licenses of top-tier open-weight models such as Llama and Gemma with the more permissive licenses found in the middle tier, suggesting that license terms alone can determine whether an IT department approves a model.

– **Market Dynamics**: Concerns over censorship and trust in AI outputs point to a potential realignment of where innovation occurs in the AI landscape, particularly for organizations seeking transparency and ethical governance in AI usage.

Key Implications for Security and Compliance Professionals:
– Awareness of the geopolitical implications of adopting AI technologies is essential for organizations, particularly when sourcing AI solutions from regions with different regulatory and ideological frameworks.
– Embedded biases or backdoors in AI models can pose significant risks to compliance and ethical standards, demanding thorough vetting of any AI solution implemented in business processes.
– There is a growing need for transparency in AI systems, which should be factored into organizations’ AI governance frameworks; verification should cover not just the technology itself but also its provenance, licensing, and the intentions of its creators (see the sketch below).
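As one concrete vetting step (an illustration, not something from the original post), an IT department could programmatically check the license a model repository declares before approving it. The sketch below assumes the huggingface_hub Python library; the repo IDs and the allow-list policy are illustrative assumptions.

```python
# Sketch: checking the license a Hugging Face model repo declares before
# approval. The allow-list is an example policy, not a recommendation.
from huggingface_hub import HfApi

APPROVED_LICENSES = {"apache-2.0", "mit"}  # illustrative allow-list

def declared_license(repo_id: str) -> str | None:
    """Return the license tag a model repo declares, if any."""
    info = HfApi().model_info(repo_id)
    for tag in info.tags or []:
        if tag.startswith("license:"):
            return tag.removeprefix("license:")
    return None

for repo in ("Qwen/Qwen3-8B", "allenai/OLMo-2-1124-7B"):
    lic = declared_license(repo)
    verdict = "pre-approved" if lic in APPROVED_LICENSES else "needs manual review"
    print(f"{repo}: license={lic!r} -> {verdict}")
```

Note that a declared license says nothing about training-data provenance or possible backdoors, which is precisely the gap the post describes; those concerns still require human review.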

This discussion is relevant to various sectors, including those focused on AI security, compliance frameworks, and the broader implications of AI technology adoption in diverse geographical contexts.