Simon Willison’s Weblog: Grok 4 Fast

Sep 21, 2025

—

Source URL: https://simonwillison.net/2025/Sep/20/grok-4-fast/
Source: Simon Willison’s Weblog
Title: Grok 4 Fast

Feedly Summary: Grok 4 Fast
New hosted reasoning model from xAI that’s designed to be fast and extremely competitive on price. It has a 2 million token context window and “was trained end-to-end with tool-use reinforcement learning".
It’s priced at $0.20/million input tokens and $0.50/million output tokens – 15x less than Grok 4 (which is $3/million input and $15/million output). That puts it cheaper than GPT-5 mini and Gemini 2.5 Flash on llm-prices.com.
The same model weights handle reasoning and non-reasoning based on a parameter passed to the model.
I’ve been trying it out via my updated llm-openrouter plugin, since Grok 4 Fast is available for free on OpenRouter for a limited period.
The non-reasoning model:
llm -m openrouter/x-ai/grok-4-fast:free "Generate an SVG of a pelican riding a bicycle"

And the reasoning model:
llm -m openrouter/x-ai/grok-4-fast:free "Generate an SVG of a pelican riding a bicycle" -o reasoning_enabled true

In related news, the New York Times had a story a couple of days ago about Elon’s recent focus on xAI: Since Leaving Washington, Elon Musk Has Been All In on His A.I. Company.
Tags: ai, generative-ai, llms, llm, llm-pricing, pelican-riding-a-bicycle, llm-reasoning, grok, llm-release, openrouter

AI Summary and Description: Yes

Summary: The text introduces Grok 4 Fast, a new hosted reasoning model from xAI, highlighting its competitive pricing and capabilities. Professionals in AI and cloud security should note the model’s unique features and its impact on pricing trends in the AI landscape.

Detailed Description: The release of Grok 4 Fast is significant within the AI sector, particularly for those interested in large language models (LLMs). Here are the key points and implications for security and compliance professionals:

– **Model Overview**: Grok 4 Fast is a hosted reasoning model designed to be highly efficient and competitively priced.
– **Technical Specifications**:
– Supports a **2 million token context window**, allowing for substantial input capacity.
– Uses an **end-to-end tool-use reinforcement learning** approach for training, which can enhance model efficacy in specific applications.
– **Pricing Structure**:
– Priced at **$0.20/million input tokens** and **$0.50/million output tokens**. This price point is significantly lower than its predecessor, Grok 4, and competitive with other models like GPT-5 mini and Gemini 2.5 Flash.
– Such pricing could encourage broader adoption of reasoning models, potentially democratizing access to advanced AI capabilities.
– **Use-Cases**:
– The model can handle both reasoning and non-reasoning tasks based on parameters set by users, indicating its versatility.
– Example commands illustrate its practical applications, such as generating SVG artwork with specified reasoning.

– **Market Context**:
– The release comes amid increased interest in AI technologies and raises considerations about xAI’s strategic direction under Elon Musk’s vision, as noted in recent media.

The combination of competitive pricing, advanced capabilities, and the context of prominent figures in AI makes Grok 4 Fast a relevant development for professionals in AI, security, and compliance fields. Monitoring such advancements is crucial as they could influence data governance, compliance strategies, and the overall landscape of AI technologies.

.NET 1 2 2025 3 4 5 5 flash a access Act adoption advanced advanced AI advanced capabilities advancement advancements age AI AI capabilities AI landscape AI technologies All allow and anti app Application applications art as at ated based Bi bicycle bot by C capabilities capacity CI CIA Cloud cloud security co command competitive competitive pricing compliance compliance professionals compliance strategies Context context window D data data governance day days de demo design development e efficient Elon Musk end fast feature features flash for free g Gemini Gemini 2 Gen generative Go governance GPT Grok gs H heap high Highlight hosted http HTTPS impact implications implications for security in Influence Inforce inter io ite k Key l land language language model language models large large language model large language models Large Language Models (LLMs) learning led Li llm llm-pricing llms lm low M man market market context media mid mini Mode model model design model weights models Monitor monitoring my N nation new New York news no non NPU o OCR of on ons open openrouter OPM opt ory oS other out output over parameter pelican per plugin point potential practical application practical applications pre price pricing pricing structure pro professionals ps Q R rag Raise rate RCE re reasoning reasoning mode reasoning model reasoning models reasoning tasks red reinforcement reinforcement learning release riding Ro RSA s sam sec sector security security and compliance side Sig Sim Simon Willison source specific SSE SSO strategic strategies support SVG T Tags: Task tasks tech technical technical specifications technologies ted text the Time times to token token context tokens tool Tor TP trained training trends two under up update US use user Users V versatility Vision web weight Wi Wind x XAI york z