Source URL: https://simonwillison.net/2025/Jul/29/space-invaders/
Source: Simon Willison’s Weblog
Title: My 2.5 year old laptop can write Space Invaders in JavaScript now
Feedly Summary: I wrote about the new GLM-4.5 model family yesterday: new open weight (MIT licensed) models from Z.ai in China whose benchmarks claim strong scores on coding, even against models such as Claude Sonnet 4.
The models are pretty big – the smaller GLM-4.5 Air model is still 106 billion total parameters, which is 205.78GB on Hugging Face.
Ivan Fioravanti built this 44GB 3bit quantized version for MLX, specifically sized so people with 64GB machines could have a chance of running it. I tried it out… and it works extremely well.
I fed it the following prompt:
Write an HTML and JavaScript page implementing space invaders
And it churned away for a while and produced the following:
Clearly this isn’t a particularly novel example, but I still think it’s noteworthy that a model running on my 2.5 year old laptop (a 64GB MacBook Pro M2) is able to produce code like this – especially code that worked first time with no further edits needed.
How I ran the model
I had to run it using the current main branch of the mlx-lm library (to ensure I had this commit adding glm4_moe support). I ran that using uv like this:
uv run \
  --with 'https://github.com/ml-explore/mlx-lm/archive/489e63376b963ac02b3b7223f778dbecc164716b.zip' \
  python
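(If you'd rather skip the interactive interpreter, mlx-lm also ships a mlx_lm.generate command-line tool that should handle the same job in one shot. I believe the flags look like this, but double-check with --help:)

uv run \
  --with 'https://github.com/ml-explore/mlx-lm/archive/489e63376b963ac02b3b7223f778dbecc164716b.zip' \
  python -m mlx_lm.generate \
  --model mlx-community/GLM-4.5-Air-3bit \
  --prompt "Write an HTML and JavaScript page implementing space invaders" \
  --max-tokens 8192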
Then in that Python interpreter I used the standard recipe for running MLX models:
from mlx_lm import load, generate
model, tokenizer = load("mlx-community/GLM-4.5-Air-3bit")
That downloaded 44GB of model weights to my ~/.cache/huggingface/hub/models--mlx-community--GLM-4.5-Air-3bit folder.
Then:
prompt = "Write an HTML and JavaScript page implementing space invaders"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True
)
response = generate(
    model, tokenizer,
    prompt=prompt,
    verbose=True,
    max_tokens=8192
)
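generate() returns the completion as a string, so if you want to try the result without copying and pasting you can pull the HTML out of the response and write it to a file. Here's a rough sketch, which assumes the model wraps the page in a ```html fenced block (it may not always):

import re

# Grab the contents of a ```html fenced block if there is one,
# otherwise fall back to saving the whole response.
match = re.search(r"```html\n(.*?)```", response, re.DOTALL)
html = match.group(1) if match else response

with open("space-invaders.html", "w") as f:
    f.write(html)

# Then open space-invaders.html in your browser.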
The response started like this:
Player spaceship that can move left/right and shoot
Enemy invaders that move in formation and shoot back
Score tracking
Lives/health system
Game over conditions […]
Followed by the HTML and this debugging output:
Prompt: 14 tokens, 14.095 tokens-per-sec
Generation: 4193 tokens, 25.564 tokens-per-sec
Peak memory: 47.687 GB
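That works out to 4193 / 25.564 ≈ 164 seconds, so a bit under three minutes of generation time.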
You can see the full transcript here, or view the source on GitHub, or try it out in your browser.
A pelican for good measure
I ran my pelican benchmark against the full sized models yesterday, but I couldn’t resist trying it against this smaller 3bit model. Here’s what I got for "Generate an SVG of a pelican riding a bicycle":
Here’s the transcript for that.
In both cases the model used around 48GB of RAM at peak, leaving me with just 16GB for everything else – I had to quit quite a few apps in order to get the model to run but the speed was pretty good once it got going.
Local coding models are really good now
It’s interesting how almost every model released in 2025 has specifically targeted coding. That focus has clearly been paying off: these coding models are getting really good now.
Two years ago when I first tried LLaMA I never dreamed that the same laptop I was using then would one day be able to run models with capabilities as strong as what I’m seeing from GLM 4.5 Air – and Mistral 3.2 Small, and Gemma 3, and Qwen 3, and a host of other high quality models that have emerged over the past six months.
Tags: python, ai, generative-ai, local-llms, llms, ai-assisted-programming, uv, mlx, pelican-riding-a-bicycle
AI Summary and Description: Yes
Summary: The text discusses the new GLM-4.5 model family from Z.ai, highlighting its capabilities in coding and ease of use on standard hardware, which has significant implications for AI-assisted programming and the accessibility of advanced AI models for developers.
Detailed Description: The content provides insights into the latest developments in generative AI models, specifically the GLM-4.5 family released by Z.ai. Here are the major points from the text:
– **Model Specifications**:
– The GLM-4.5 models are described as large-scale, with the smallest variant (GLM-4.5 Air) possessing 106 billion parameters and requiring approximately 205.78GB of storage on Hugging Face.
– An optimized 3bit quantized version (44GB) was created to allow users with 64GB machines to run it effectively.
– **Practical Application**:
– The author successfully executed the model on a two-and-a-half-year-old laptop with 64GB RAM, generating functional code for a Space Invaders game in HTML and JavaScript on the first try.
– The ease of use and efficiency demonstrated by generating code without needing further edits signifies the model’s advanced capabilities in AI-assisted programming.
– **Technical Execution**:
– The process of running the model involved using the main branch of the mlx-lm library with specific commands in Python, showcasing the accessibility of the technology to developers familiar with programming environments.
– **Performance Insights**:
– The model’s response time and memory usage are detailed, noting its peak memory usage of around 48GB, which limits available resources for other applications during execution.
– The text emphasizes the growing efficiency and capability of coding models released recently, indicating a trend toward improved performance in AI-assisted coding tasks.
– **Industry Trends**:
– The author reflects on the progression of AI models, noting that there is a growing focus on coding applications in generative AI models released in 2025, suggesting a significant shift towards tools that aid developers.
These points indicate the relevance of the GLM-4.5 model for professionals in AI, software security, and cloud computing, as it reveals advancements in generative AI’s capabilities and accessibility for programming tasks. The availability and efficacy of such models can influence software development processes, streamline coding tasks, and promote further innovations in AI-assisted programming.