Simon Willison’s Weblog: Kimi-K2-Instruct-0905

Source URL: https://simonwillison.net/2025/Sep/6/kimi-k2-instruct-0905/#atom-everything
Source: Simon Willison’s Weblog
Title: Kimi-K2-Instruct-0905

Feedly Summary: Kimi-K2-Instruct-0905
New not-quite-MIT licensed model from Chinese Moonshot AI, a follow-up to the highly regarded Kimi-K2 model they released in July.
This one is an incremental improvement – I’ve seen it referred to online as “Kimi K-2.1”. It scores a little higher on a bunch of popular coding benchmarks, reflecting Moonshot’s claim that it “demonstrates significant improvements in performance on public benchmarks and real-world coding agent tasks”.
More importantly the context window size has been increased from 128,000 to 256,000 tokens.
Like its predecessor this is a big model – 1 trillion parameters in a mixture-of-experts configuration with 384 experts, 32B activated parameters and 8 selected experts per token.
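To illustrate what “8 selected experts per token” means in a mixture-of-experts layer, here is a minimal sketch of top-k expert routing. The expert count (384) and top-k value (8) echo the published figures, but the gate design, dimensions, and naming are illustrative assumptions, not Moonshot’s actual implementation:

```python
import numpy as np

def top_k_routing(hidden, gate_weights, k=8):
    """Route one token's hidden state to its top-k experts.

    hidden:       (d_model,) activation for a single token
    gate_weights: (d_model, n_experts) router projection (hypothetical)
    Returns the chosen expert indices and their normalized mixing weights.
    """
    logits = hidden @ gate_weights              # (n_experts,) router scores
    top = np.argsort(logits)[-k:][::-1]         # indices of the k highest-scoring experts
    scores = np.exp(logits[top] - logits[top].max())
    weights = scores / scores.sum()             # softmax over the selected experts only
    return top, weights

# Dimensions echoing the post: 384 experts, 8 active per token.
rng = np.random.default_rng(0)
d_model, n_experts = 64, 384
token = rng.standard_normal(d_model)
gate = rng.standard_normal((d_model, n_experts))
experts, weights = top_k_routing(token, gate, k=8)
# Only 8 of the 384 experts run for this token, which is why the
# activated parameter count (32B) is far below the total (1T).
```

This sparsity is what lets a 1-trillion-parameter model run with per-token compute closer to that of a 32B dense model.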
I used Groq’s playground tool to try “Generate an SVG of a pelican riding a bicycle” and got this result, at a very healthy 445 tokens/second taking just under 2 seconds total.
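For scale, the quoted throughput puts an upper bound on how long the SVG response could have been. The figures come from the post; the arithmetic is a back-of-envelope check:

```python
# Back-of-envelope: maximum tokens generated at the quoted Groq throughput.
tokens_per_second = 445            # reported by Groq's playground
elapsed_seconds = 2.0              # "just under 2 seconds total"
max_tokens_generated = tokens_per_second * elapsed_seconds
print(max_tokens_generated)        # upper bound on the response length, in tokens
```

So the entire pelican SVG fit in fewer than roughly 900 tokens.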

Tags: ai, generative-ai, llms, groq, pelican-riding-a-bicycle, llm-release, ai-in-china

AI Summary and Description: Yes

Summary: The text discusses the release of an upgraded AI model, Kimi-K2-Instruct-0905, by Moonshot AI, highlighting its enhanced coding performance and expanded context window size. This information is significant for professionals in AI and cloud computing, especially considering the advancements in large language models and their practical applications.

Detailed Description: The Kimi-K2-Instruct-0905 model marks a notable advancement in the landscape of AI and LLMs (large language models). It is an incremental upgrade to the well-received Kimi-K2 model and sharpens the competition among generative AI offerings.

Key Points:
– **Model Release**: Kimi-K2-Instruct-0905 is presented as a follow-up to the earlier Kimi-K2, which was noted for its performance.
– **Performance Enhancement**: The new model reportedly scores better on coding benchmarks, indicating improvements that can enhance the speed and accuracy of coding tasks for developers and AI-driven solutions.
– **Increased Context Window**: The context window has doubled from 128,000 to 256,000 tokens, allowing broader and more complex task handling, which is critical for applications that require context-rich interactions.
– **Model Specifications**:
  – 1 trillion parameters in a mixture-of-experts configuration.
  – 384 experts, with 32 billion activated parameters and 8 selected experts per token.
– **Practical Application**: An example task showcased the model’s speed and efficiency, generating a complex SVG image in under 2 seconds.

This upgrade is particularly relevant for developers utilizing generative AI and LLMs in their workflows. It enhances the capabilities of AI-based coding assistants and shows that AI can handle more intricate tasks while improving overall performance, a key consideration for businesses integrating AI into their operations.

Furthermore, the advancements in Moonshot AI’s new model could influence best practices in cloud computing security: as organizations increasingly rely on AI for coding and automation tasks, security protocols will need continuous monitoring and adjustment as the technology evolves.