Source URL: https://simonwillison.net/2025/Oct/1/two-pelicans/#atom-everything
Source: Simon Willison’s Weblog
Title: Two more Chinese pelicans
Feedly Summary: Two new models from Chinese AI labs in the past few days. I tried them both out using llm-openrouter:
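As a minimal sketch of that workflow (assuming the plugin is installed with `llm install llm-openrouter` and an OpenRouter key is set via `llm keys set openrouter`; the model slugs below are assumptions, `llm models` lists the real identifiers):

```python
# Sketch only: prompt both models via llm-openrouter's Python API.
# Model slugs are assumptions; run `llm models` to see the registered names.
import llm

for model_id in (
    "openrouter/deepseek/deepseek-v3.2-exp",  # assumed slug for DeepSeek-V3.2-Exp
    "openrouter/z-ai/glm-4.6",                # assumed slug for GLM-4.6
):
    model = llm.get_model(model_id)
    response = model.prompt("Generate an SVG of a pelican riding a bicycle")
    print(f"--- {model_id} ---")
    print(response.text())
```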
DeepSeek-V3.2-Exp from DeepSeek. Announcement, Tech Report, Hugging Face (690GB, MIT license).
As an intermediate step toward our next-generation architecture, V3.2-Exp builds upon V3.1-Terminus by introducing DeepSeek Sparse Attention—a sparse attention mechanism designed to explore and validate optimizations for training and inference efficiency in long-context scenarios.
This one felt very slow when I accessed it via OpenRouter – I probably got routed to one of the slower providers. Here’s the pelican:
GLM-4.6 from Z.ai. Announcement, Hugging Face (714GB, MIT license).
The context window has been expanded from 128K to 200K tokens […] higher scores on code benchmarks […] GLM-4.6 exhibits stronger performance in tool using and search-based agents.
Here’s the pelican for that:
Tags: llm, pelican-riding-a-bicycle, deepseek, ai-in-china, llms, llm-release, generative-ai, openrouter, ai
AI Summary and Description: Yes
Summary: The text discusses the release and features of two new AI models from Chinese labs, DeepSeek-V3.2-Exp and GLM-4.6, focusing on their architecture and performance enhancements. This is particularly relevant for professionals in AI and generative AI security domains, given the emphasis on model efficiency and capabilities.
Detailed Description: The content highlights two significant developments in the landscape of large language models (LLMs) from Chinese AI labs, pertinent to those in the fields of AI, cloud, and information security. Here’s a breakdown of the major points:
– **DeepSeek-V3.2-Exp**:
– Developed by DeepSeek, this model is an advancement over its predecessor, V3.1-Terminus.
– Introduces **DeepSeek Sparse Attention**, a novel mechanism aimed at improving both training and inference efficiency, especially for long-context applications.
– Sparse attention reduces the compute and memory cost of long-context inference, which matters for security in AI deployments: compute budgets, rate limits, and abuse protections all have to account for cheaper long-context processing (a toy illustration follows).
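As a toy illustration of the general idea (not DeepSeek's actual DSA, which per the tech report pairs a learned indexer with fine-grained token selection), here is top-k sparse attention in NumPy, where each query attends only to its k highest-scoring keys rather than the full sequence:

```python
import numpy as np

def topk_sparse_attention(Q, K, V, k):
    """Each query attends only to its k highest-scoring keys.

    Toy stand-in for sparse attention: the dense score matrix is still
    computed here and then pruned. Causal masking is omitted for brevity.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # (n_q, n_k) full attention scores
    # Threshold = each row's k-th largest score; everything below is dropped.
    kth = np.partition(scores, -k, axis=-1)[:, -k:].min(axis=-1, keepdims=True)
    masked = np.where(scores >= kth, scores, -np.inf)
    # Softmax over the surviving top-k entries per row (ties may keep more).
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.standard_normal((8, 16))
K = rng.standard_normal((8, 16))
V = rng.standard_normal((8, 16))
print(topk_sparse_attention(Q, K, V, k=3).shape)  # (8, 16)
```

Note that this naive version still computes the dense score matrix; the point of a real indexer is to make the selection step itself cheap, so the whole pipeline scales better than O(n^2) on long sequences.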
– **GLM-4.6**:
– Released by Z.ai, the model expands the context window from 128K to 200K tokens.
– Reports higher scores on code benchmarks, suggesting improved capability on complex development tasks.
– Stronger performance in tool-using and search-based agents points toward automation and decision-making applications, which raises security considerations whenever the model can invoke tools that touch sensitive data or systems (a hedged tool-calling sketch follows).
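To make "tool use" concrete, here is a hedged sketch of exposing a single function tool to GLM-4.6 through OpenRouter's OpenAI-compatible endpoint; the `z-ai/glm-4.6` slug and the `search_web` tool are assumptions for illustration:

```python
# Hedged sketch: one tool offered to GLM-4.6 via OpenRouter's
# OpenAI-compatible API. Model slug and tool name are assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",  # your OpenRouter API key
)

tools = [{
    "type": "function",
    "function": {
        "name": "search_web",  # hypothetical tool your own code would implement
        "description": "Search the web and return the top results",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="z-ai/glm-4.6",  # assumed OpenRouter slug
    messages=[{"role": "user", "content": "Find the GLM-4.6 release notes"}],
    tools=tools,
)

# If the model chose to call the tool, the structured call arrives here
# instead of plain text; an agent loop would execute it and reply.
print(response.choices[0].message.tool_calls)
```

The security angle is visible right in the schema: whatever `search_web` (or a file or shell tool) can reach, the model can now reach too.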
These advancements signal a continuous evolution in the AI and generative AI domains, prompting further investigation into their security implications.
– **Implications for Security Professionals**:
– Monitoring newly released models for security vulnerabilities as well as optimization opportunities.
– Evaluating the impact of such rapidly evolving technologies on compliance with regulations and governance in AI deployment.
– Understanding the balance between performance enhancements and potential risks related to security exploits, especially with models capable of processing large amounts of data and context.
Overall, these developments are crucial as they lay the groundwork for the future of AI applications, necessitating a proactive approach to security and compliance in this evolving field.