Simon Willison’s Weblog: moonshotai/Kimi-K2-Instruct

Jul 11, 2025

—

Source URL: https://simonwillison.net/2025/Jul/11/kimi-k2/#atom-everything
Source: Simon Willison’s Weblog
Title: moonshotai/Kimi-K2-Instruct

Feedly Summary: moonshotai/Kimi-K2-Instruct
Colossal new open weights model release today from Moonshot AI, a two year old Chinese AI lab with a name inspired by Pink Floyd’s album The Dark Side of the Moon.
My HuggingFace storage calculator says the repository is 958.52 GB. It’s a mixture-of-experts model with “32 billion activated parameters and 1 trillion total parameters", trained using the Muon optimizer as described in Moonshot’s joint paper with UCLA Muon is Scalable for LLM Training.
I think this may be the largest ever open weights model? DeepSeek v3 is 671B.
I created an API key for Moonshot, added some dollars and ran a prompt against it using my LLM tool. First I added this to the extra-openai-models.yaml file:
– model_id: kimi-k2
model_name: kimi-k2-0711-preview
api_base: https://api.moonshot.ai/v1
api_key_name: moonshot

Then I set the API key:
llm keys set moonshot
# Paste key here

And ran a prompt:
llm -m kimi-k2 "Generate an SVG of a pelican riding a bicycle" \
-o max_tokens 2000

(The default max tokens setting was too short.)

This is pretty good! The spokes are a nice touch. Full transcript here.
This one is open weights but not open source: they’re using a modified MIT license with this non-OSI-compliant section tagged on at the end:

Our only modification part is that, if the Software (or any derivative works
thereof) is used for any of your commercial products or services that have
more than 100 million monthly active users, or more than 20 million US dollars
(or equivalent in other currencies) in monthly revenue, you shall prominently
display "Kimi K2" on the user interface of such product or service.

Via Hacker News
Tags: ai, generative-ai, llms, llm, pelican-riding-a-bicycle, llm-release

AI Summary and Description: Yes

Summary: The text discusses the release of a new large model named Kimi-K2 from Moonshot AI, highlighting its scale, architecture, and licensing terms. This entry marks a significant development in the generative AI space, especially given the model’s vast number of parameters and its implications for API usage.

Detailed Description: The passage provides valuable insights for professionals interested in generative AI, large language models (LLMs), and the implications of licensing in AI technologies.

* **Model Overview**:
– **Developer**: Moonshot AI, a newer AI laboratory based in China.
– **Model Name**: Kimi-K2.
– **Size and Architecture**:
– The model boasts “32 billion activated parameters and 1 trillion total parameters,” making it one of the largest open weights models released.
– It is a mixture-of-experts model, indicating a sophisticated architecture for enhanced performance.
– Utilizes the Muon optimizer, a technique highlighted in collaborative research with UCLA.

* **Licensing Terms**:
– Although classified as “open weights,” the model is not open source.
– It operates under a modified MIT license with specific commercial constraints:
– If used in commercial products with over 100 million monthly active users or 20 million USD in monthly revenue, explicit attribution to “Kimi K2” must be displayed.

* **Practical Application**:
– The text illustrates how to interact with this model through an API, providing clear steps on integrating it with a user’s tools (e.g., a YAML configuration for API access).
– An example prompt showcases the model’s capabilities, indicating the potential for generating complex requests, such as the generation of SVG art.

* **Significance for Security and Compliance**:
– The licensing conditions raise important considerations for organizations looking to integrate advanced AI models. Compliance with the stipulated conditions is critical, especially for entities at scale.
– The scale of the model also poses security implications regarding data handling, including the safeguarding of API keys and the management of potentially large amounts of generated content.

This release serves as a notable milestone in the AI landscape, emphasizing both technical depth and practicality while shedding light on the evolving framework of AI licensing and its implications for the industry.

.NET 1 10 2 2025 3 5 7 a access Act advanced advanced AI AI AI landscape ai model AI models AI technologies alt and API API keys app Application Arch architecture art as at ated attribution based Bi bicycle by C capabilities China Chinese CI CIA class CleaR co Col collaborative collaborative research commercial compliance Condi Configuration content critical D data Data Handling day de deep DeepSeek Deepseek v3 depth developer development e end enhanced performance Entry exp expert Experts face fault file first for framework full g Gen generated Generated Content generation generative Generative AI Go gs H hack hacker Hacker News handling high Highlight HR http HTTPS hugging Huggingface implications in industry insights inter interface io IRS ite J k Key keys l Labor land language language model language models large large language model large language models Large Language Models (LLMs) led Li license licensing licensing terms llm llms lm M making man management max milestone Mixture mixture-of-experts ML Mode model models ModI moonshot my N new news no non o of on one only open open weights open weights models openai OPM opt optimizer organization organizations ory oS other over paper parameter pelican per performance phi play potential pre Preview pro product products professionals prompt ps Q R rag Raise rate RCE red release repository research revenue review riding Ro s safe scalable Scale SD search sec security security and compliance security implications service services SHA short shot side Sig Sim size software source specific SSE storage SVG T Tags: tech technologies ted text the to token tokens tool tools Tor TP trained training trillion two UI under US usage use user user interface Users V V3 val Ware web weight weights models Wi x yaml yt z