Source URL: https://simonwillison.net/2025/Mar/17/mistral-small-31/#atom-everything
Source: Simon Willison’s Weblog
Title: Mistral Small 3.1
Feedly Summary: Mistral Small 3.1
Mistral Small 3 came out in January and was a notable, genuinely excellent local model that used an Apache 2.0 license.
Mistral Small 3.1 offers a significant improvement: it’s multi-modal (images) and has an increased context length of 128,000 tokens, while still “fitting within a single RTX 4090 or a 32GB RAM MacBook once quantized” (according to their model card). Mistral’s own benchmarks show it outperforming Gemma 3 and GPT-4o Mini, but I haven’t seen confirmation from external benchmarks.
Despite their mention of a 32GB MacBook, I haven’t actually seen any quantized GGUF or MLX releases yet, which is a little surprising since they partnered with Ollama on launch day for their previous Mistral Small 3. I expect we’ll see various quantized models released by the community shortly.
The model is available via their La Plateforme API, which means you can access it via my llm-mistral plugin.
Here’s the model describing my photo of two pelicans in flight:
llm install llm-mistral
# Run this if you have previously installed the plugin:
llm mistral refresh
llm -m mistral/mistral-small-2503 'describe' \
  -a https://static.simonwillison.net/static/2025/two-pelicans.jpg
The image depicts two brown pelicans in flight against a clear blue sky. Pelicans are large water birds known for their long bills and large throat pouches, which they use for catching fish. The birds in the image have long, pointed wings and are soaring gracefully. Their bodies are streamlined, and their heads and necks are elongated. The pelicans appear to be in mid-flight, possibly gliding or searching for food. The clear blue sky in the background provides a stark contrast, highlighting the birds’ silhouettes and making them stand out prominently.
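If you’d rather call the model from Python than from the command line, the llm library also exposes a Python API. The following is a minimal sketch, assuming the llm-mistral plugin is installed and a Mistral API key has already been configured (for example with llm keys set mistral); it mirrors the CLI invocation above rather than documenting a tested recipe.
import llm

# Same model alias the llm-mistral plugin exposes on the command line.
model = llm.get_model("mistral/mistral-small-2503")

# Attach the image by URL and ask for a description.
response = model.prompt(
    "describe",
    attachments=[
        llm.Attachment(url="https://static.simonwillison.net/static/2025/two-pelicans.jpg")
    ],
)
print(response.text())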
Tags: vision-llms, mistral, llm, generative-ai, ai, llms
AI Summary and Description: Yes
Summary: The text discusses the release of Mistral Small 3.1, a multi-modal (text and image) model with a 128,000 token context length that can run locally once quantized. Its implications for local model deployment, along with Mistral’s benchmark comparisons against existing models, make it noteworthy for professionals in AI and infrastructure security.
Detailed Description: The release of Mistral Small 3.1 marks a substantial advancement in the field of AI, specifically in terms of its usability and capabilities. Below are some major points regarding this new model:
– **Model Capabilities**:
– Mistral Small 3.1 is multi-modal, meaning it can process both text and images, expanding its applicability in different scenarios.
– It supports a context length of up to 128,000 tokens, a substantial increase over its predecessor that allows longer documents and conversations to be handled in a single request.
– Once quantized, the model fits on a single RTX 4090 or a 32GB RAM MacBook, which makes local deployment practical without specialized infrastructure.
– **Performance Benchmarks**:
– Mistral’s own benchmarks show Mistral Small 3.1 outperforming comparable models such as Gemma 3 and GPT-4o Mini. Those results are promising, but independent external validation is still pending.
– **Model Accessibility**:
– Mistral Small 3.1 can be accessed via the La Plateforme API, which enables integration into applications and workflows for developers and data scientists (a minimal request sketch appears after this list).
– The installation and usage instructions above show it can be driven from the command line with the llm-mistral plugin in a couple of commands, which lowers the barrier to deployment for professionals managing AI infrastructure.
– **Community Engagement and Future Developments**:
– The expectation of community-driven releases of quantized models suggests a collaborative approach to development in the AI space, encouraging experimentation and wider adoption among developers.
– The reference to the llm-mistral plugin indicates straightforward integration with existing tools and workflows, further extending its use in real-world scenarios.
– **Application Insights**:
– The ability to describe complex images, like the provided example of pelicans in flight, showcases the capability of AI models to interpret and generate detailed information based on visual inputs. This can be relevant for applications in areas such as educational tools, content generation, and enhanced user engagement through intelligent response systems.
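As a concrete illustration of the API access point noted above, here is a hedged sketch of calling La Plateforme’s chat completions endpoint directly over HTTP. The model identifier mistral-small-2503 is inferred from the plugin alias earlier in the post, and the MISTRAL_API_KEY environment variable name is an assumption; check Mistral’s API documentation for the current model names and request format.
import os
import requests

# Assumed: La Plateforme chat completions endpoint with an OpenAI-style payload;
# the model name and MISTRAL_API_KEY variable are assumptions, not confirmed here.
resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-small-2503",
        "messages": [
            {"role": "user", "content": "Summarize what Mistral Small 3.1 adds over Mistral Small 3."}
        ],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])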
This model represents an important development in the generative AI landscape, particularly for those involved in AI security and model deployment: it combines multi-modal capability and a long context window with hardware requirements modest enough for local use, which can foster innovation in AI applications.