Simon Willison’s Weblog: mlx-community/OLMo-2-0325-32B-Instruct-4bit

Source URL: https://simonwillison.net/2025/Mar/16/olmo2/#atom-everything
Source: Simon Willison’s Weblog
Title: mlx-community/OLMo-2-0325-32B-Instruct-4bit

Feedly Summary: mlx-community/OLMo-2-0325-32B-Instruct-4bit
OLMo 2 32B claims to be “the first fully-open model (all data, code, weights, and details are freely available) to outperform GPT3.5-Turbo and GPT-4o mini”. Thanks to the MLX project, here’s a recipe that worked for me to run it on my Mac, via my llm-mlx plugin.
To install the model:
llm install llm-mlx
llm mlx download-model mlx-community/OLMo-2-0325-32B-Instruct-4bit

That downloads 17GB to ~/.cache/huggingface/hub/models--mlx-community--OLMo-2-0325-32B-Instruct-4bit.
To start an interactive chat with OLMo 2:
llm chat -m mlx-community/OLMo-2-0325-32B-Instruct-4bit

Or to run a prompt:
llm -m mlx-community/OLMo-2-0325-32B-Instruct-4bit 'Generate an SVG of a pelican riding a bicycle' -o unlimited 1

The -o unlimited 1 option removes the cap on the number of output tokens – the default for llm-mlx is 1024, which isn’t enough to attempt to draw a pelican.
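The same recipe can be driven from Python via llm’s programmatic API instead of the CLI. The sketch below assumes llm and llm-mlx are installed and the model has already been downloaded; the `hf_cache_dir` helper and `run_prompt` wrapper are illustrative names of my own, not part of the post.

```python
from pathlib import Path

MODEL_ID = "mlx-community/OLMo-2-0325-32B-Instruct-4bit"


def hf_cache_dir(model_id: str) -> Path:
    """Where the Hugging Face hub cache stores a downloaded model:
    slashes in the repo id become '--' in the directory name."""
    return Path.home() / ".cache/huggingface/hub" / ("models--" + model_id.replace("/", "--"))


def run_prompt(prompt: str) -> str:
    """Run a prompt through the model using llm's Python API.

    Requires `llm` and `llm-mlx` to be installed; not executed here.
    """
    import llm  # the same library that powers the `llm` CLI

    model = llm.get_model(MODEL_ID)
    # unlimited=True mirrors `-o unlimited 1` on the CLI (an llm-mlx option)
    response = model.prompt(prompt, unlimited=True)
    return response.text()
```

With the model downloaded, `run_prompt("Generate an SVG of a pelican riding a bicycle")` should be equivalent to the CLI invocation above.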
The pelican it drew is refreshingly abstract. (The rendered SVG appears in the original post.)
Via @awnihannun
Tags: llm, generative-ai, mlx, ai2, ai, llms, pelican-riding-a-bicycle

AI Summary and Description: Yes

Summary: The provided text discusses the OLMo 2 model, presented as the first fully open AI model to outperform existing models like GPT-3.5-Turbo and GPT-4o mini. It details how users can install and run the model on their local machines, highlighting the ease of access and the flexibility of the model in generating specific outputs. This is particularly relevant for developers and researchers in the AI and machine learning fields.

Detailed Description: The text presents essential information about the OLMo 2 generative AI model, developed under the MLX project, emphasizing its fully open nature. Here are the key points:

– **Model Overview**:
– OLMo 2 32B is characterized as a pioneering fully-open model, meaning all its components (data, code, weights, and training details) are available for public use.
– It claims to outperform notable models such as GPT-3.5-Turbo and GPT-4o mini, providing a competitive alternative for developers.

– **Installation Instructions**:
– The text gives a step-by-step guide on how to install the model using the `llm-mlx` plugin.
– Users can download the model easily, with a command that streamlines the process:
– `llm install llm-mlx`
– `llm mlx download-model mlx-community/OLMo-2-0325-32B-Instruct-4bit`
– The 4-bit quantized model is a 17GB download, so running it locally requires a machine with enough memory to hold it.

– **Usage**:
– Clear examples of how to start an interactive chat with OLMo 2 are provided:
– Command: `llm chat -m mlx-community/OLMo-2-0325-32B-Instruct-4bit`
– The text also shows how to run prompts and modify output parameters, such as raising the output token cap so longer responses can complete:
– Command: `llm -m mlx-community/OLMo-2-0325-32B-Instruct-4bit 'Generate an SVG of a pelican riding a bicycle' -o unlimited 1`

– **Implications for Professionals**:
– This model can potentially lower the barrier to entry for high-quality AI model development, encouraging innovation and experimentation in generative AI applications.
– The focus on open-source transparency aligns with increasing industry trends towards open collaboration and shared resources.

Overall, this text is significant for AI professionals, especially those involved in developing or utilizing generative AI models. Its emphasis on an open-source approach provides a fresh perspective in a tech landscape often characterized by proprietary technologies.