Simon Willison’s Weblog: Kimi-K2-Instruct-0905

Source URL: https://simonwillison.net/2025/Sep/6/kimi-k2-instruct-0905/#atom-everything
Source: Simon Willison’s Weblog
Title: Kimi-K2-Instruct-0905

Feedly Summary: Kimi-K2-Instruct-0905
New not-quite-MIT licensed model from Chinese Moonshot AI, a follow-up to the highly regarded Kimi-K2 model they released in July.
This one is an incremental improvement – I’ve seen it referred to online as “Kimi K-2.1”. It scores a little higher on a bunch of popular coding benchmarks, reflecting Moonshot’s claim that it “demonstrates significant improvements in performance on public benchmarks and real-world coding agent tasks”.
More importantly the context window size has been increased from 128,000 to 256,000 tokens.
Like its predecessor this is a big model – 1 trillion parameters in a mixture-of-experts configuration with 384 experts, 32B activated parameters and 8 selected experts per token.
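To illustrate what “8 selected experts per token” means in a mixture-of-experts layer, here is a minimal sketch of top-k expert routing. The expert count (384) and top-k value (8) echo the published figures, but the gate design, dimensions, and naming are illustrative assumptions, not Moonshot’s actual implementation:

```python
import numpy as np

def top_k_routing(hidden, gate_weights, k=8):
    """Route one token's hidden state to its top-k experts.

    hidden:       (d_model,) activation for a single token
    gate_weights: (d_model, n_experts) router projection (hypothetical)
    Returns the chosen expert indices and their normalized mixing weights.
    """
    logits = hidden @ gate_weights              # (n_experts,) router scores
    top = np.argsort(logits)[-k:][::-1]         # indices of the k highest-scoring experts
    scores = np.exp(logits[top] - logits[top].max())
    weights = scores / scores.sum()             # softmax over the selected experts only
    return top, weights

# Dimensions echoing the post: 384 experts, 8 active per token.
rng = np.random.default_rng(0)
d_model, n_experts = 64, 384
token = rng.standard_normal(d_model)
gate = rng.standard_normal((d_model, n_experts))
experts, weights = top_k_routing(token, gate, k=8)
# Only 8 of the 384 experts run for this token, which is why the
# activated parameter count (32B) is far below the total (1T).
```

This sparsity is what lets a 1-trillion-parameter model run with per-token compute closer to that of a 32B dense model.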
I used Groq’s playground tool to try “Generate an SVG of a pelican riding a bicycle” and got this result, at a very healthy 445 tokens/second taking just under 2 seconds total.
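For scale, the quoted throughput puts an upper bound on how long the SVG response could have been. The figures come from the post; the arithmetic is a back-of-envelope check:

```python
# Back-of-envelope: maximum tokens generated at the quoted Groq throughput.
tokens_per_second = 445            # reported by Groq's playground
elapsed_seconds = 2.0              # "just under 2 seconds total"
max_tokens_generated = tokens_per_second * elapsed_seconds
print(max_tokens_generated)        # upper bound on the response length, in tokens
```

So the entire pelican SVG fit in fewer than roughly 900 tokens.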

Tags: ai, generative-ai, llms, groq, pelican-riding-a-bicycle, llm-release, ai-in-china

AI Summary and Description: Yes

Summary: The text discusses the release of an upgraded AI model, Kimi-K2-Instruct-0905, by Moonshot AI, highlighting its enhanced coding performance and expanded context window size. This information is significant for professionals in AI and cloud computing, especially considering the advancements in large language models and their practical applications.

Detailed Description: The Kimi-K2-Instruct-0905 model marks a notable advancement in the landscape of AI and LLMs (large language models). It is an incremental upgrade to the well-received Kimi-K2 model and sharpens the competition among generative AI offerings.

Key Points:
– **Model Release**: Kimi-K2-Instruct-0905 is presented as a follow-up to the earlier Kimi-K2, which was noted for its performance.
– **Performance Enhancement**: The new model reportedly scores better on coding benchmarks, indicating improvements that can enhance the speed and accuracy of coding tasks for developers and AI-driven solutions.
– **Increased Context Window**: The context window has doubled from 128,000 to 256,000 tokens, allowing broader and more complex task handling, which is critical for applications that require context-rich interactions.
– **Model Specifications**:
  – 1 trillion parameters in a mixture-of-experts configuration.
  – 384 experts, with 32 billion activated parameters and 8 selected experts per token.
– **Practical Application**: An example task showcased the model’s speed and efficiency, generating a complex SVG image in under 2 seconds.

This upgrade is particularly relevant for developers utilizing generative AI and LLMs in their workflows. It enhances the capabilities of AI-based coding assistants and shows that AI can handle more intricate tasks while improving overall performance, a key consideration for businesses integrating AI into their operations.

Furthermore, the advancements in Moonshot AI’s new model could influence best practices in cloud computing security: as organizations increasingly rely on AI for coding and automation tasks, security protocols will need continuous monitoring and adjustment as the technology evolves.