Source URL: https://simonwillison.net/2025/Mar/19/o1-pro/
Source: Simon Willison’s Weblog
Title: OpenAI platform: o1-pro
Feedly Summary: OpenAI platform: o1-pro
OpenAI have a new most-expensive model: o1-pro can now be accessed through their API at a hefty $150/million tokens for input and $600/million tokens for output. That’s 10x the price of their o1 and o1-preview models and a full 1,000x the price of their cheapest model, gpt-4o-mini!
Aside from that, it has mostly the same features as o1: a 200,000 token context window, 100,000 max output tokens, a September 30, 2023 knowledge cut-off date, and support for function calling, structured outputs and image inputs.
o1-pro doesn’t support streaming and, most significantly for developers, is the first OpenAI model available only via their new Responses API. This means tools built against the Chat Completions API (like my own LLM) have to do a whole lot more work to support the new model – my issue for that is here.
Since LLM doesn’t support this new model yet I had to make do with curl:
curl https://api.openai.com/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(llm keys get openai)" \
  -d '{
    "model": "o1-pro",
    "input": "Generate an SVG of a pelican riding a bicycle"
  }'
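Once that returns, the reply can be inspected with jq. This is a sketch against a trimmed, hand-written stand-in for the real response: the `output`/`output_text`/`usage` field names follow the Responses API, but the sample JSON below is illustrative, not the actual reply.

```shell
# Hand-written stand-in for a Responses API reply (trimmed for illustration).
cat > response.json <<'EOF'
{
  "model": "o1-pro",
  "output": [
    {"type": "message", "content": [{"type": "output_text", "text": "<svg>...</svg>"}]}
  ],
  "usage": {"input_tokens": 81, "output_tokens": 1552}
}
EOF

# Pull out the generated text: message items carry a content list of output_text items.
jq -r '.output[] | select(.type == "message") | .content[] | select(.type == "output_text") | .text' response.json

# Token usage, needed for the cost arithmetic below.
jq '.usage' response.json
```

Note the contrast with Chat Completions, where the text lives at `.choices[0].message.content`; that structural difference is part of why existing tools need rework.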
Here’s the full JSON I got back – 81 input tokens and 1552 output tokens for a total cost of 94.335 cents.
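The cost figure checks out against the published per-token prices; a quick verification of the arithmetic:

```shell
# Verify the cost of the example request at o1-pro rates:
# $150 per million input tokens, $600 per million output tokens.
awk 'BEGIN {
  input_tokens  = 81
  output_tokens = 1552
  dollars = input_tokens / 1e6 * 150 + output_tokens / 1e6 * 600
  printf "%.3f cents\n", dollars * 100
}'
# prints "94.335 cents"
```

Almost all of the cost (93.12 of the 94.335 cents) is the output side at $600/million.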
Tags: o1, llm, openai, inference-scaling, ai, llms, llm-release, generative-ai, pelican-riding-a-bicycle, llm-pricing
AI Summary and Description: Yes
Summary: The introduction of OpenAI’s new model, o1-pro, sets a new top price tier for their large language models (LLMs) and establishes a new API requirement for developers. The shift illustrates both evolving monetization strategies in AI and the need for developers to adapt to changing API frameworks.
Detailed Description:
The text discusses OpenAI’s latest high-cost model, o1-pro, which reflects a substantial increase in pricing compared to previous models. Here’s a breakdown of the major points:
– **New Model Pricing**:
– The cost is $150 per million tokens for input and $600 for output, marking a dramatic increase—10 times that of the o1 and o1-preview models, and 1,000 times the cost of the cheapest model, gpt-4o-mini.
– **Model Features**:
– It retains many features of its predecessor o1, such as:
– A context window of 200,000 tokens.
– A maximum output of 100,000 tokens.
– Knowledge cut-off date of September 30, 2023.
– Support for function calling, structured outputs, and image inputs.
– **API Transition**:
– Importantly, o1-pro does not support streaming and is only accessible via the new Responses API, deviating from the Chat Completions API used by many existing tools and applications.
– Developers using tools designed for Chat Completions API must implement significant changes to accommodate the new model.
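The shape of that change can be sketched side by side. This is illustrative only: the prompt is the one from the post, and the point is that a compatibility shim must translate the `messages` array into the Responses API’s `input` field.

```shell
# Chat Completions request body (what existing tools emit):
chat_body='{"model": "o1-pro", "messages": [{"role": "user", "content": "Generate an SVG of a pelican riding a bicycle"}]}'

# Responses API request body (the only shape o1-pro accepts):
responses_body='{"model": "o1-pro", "input": "Generate an SVG of a pelican riding a bicycle"}'

# Sanity-check that both bodies are valid JSON before POSTing.
echo "$chat_body"      | python3 -m json.tool > /dev/null
echo "$responses_body" | python3 -m json.tool > /dev/null
```

The endpoint also changes, from `/v1/chat/completions` to `/v1/responses`, so the rework is more than renaming one field.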
– **Developer Interaction**:
– The post illustrates a real-world example of using the API via a `curl` command, showing how developers interact with the system under the new pricing structure. It also highlights the practical cost implications with a concrete example in which 81 input tokens and 1552 output tokens resulted in a charge of 94.335 cents.
– **Keywords and Tags**:
– The inclusion of tags such as o1, llm, openai, generative-ai, and llm-pricing emphasizes the relevance of this new model in the current discourse on AI pricing and infrastructure.
These changes matter for organizations leveraging AI: the higher costs and the shift in API structure add integration work for developers and may influence decisions about which models to deploy for AI applications and infrastructure. The transition underscores the need for ongoing training and adaptation in the landscape of AI development.