Simon Willison’s Weblog: OpenAI API: Responses vs. Chat Completions

Source URL: https://simonwillison.net/2025/Mar/11/responses-vs-chat-completions/#atom-everything
Source: Simon Willison’s Weblog
Title: OpenAI API: Responses vs. Chat Completions

OpenAI released a bunch of new API platform features this morning under the headline “New tools for building agents" (their somewhat mushy interpretation of "agents" here is "systems that independently accomplish tasks on behalf of users").
A particularly significant change is the introduction of a new Responses API, which is a slightly different shape from the Chat Completions API that they’ve offered for the past couple of years and which others in the industry have widely cloned as an ad-hoc standard.
In this guide they illustrate the differences, with a reassuring note that:

The Chat Completions API is an industry standard for building AI applications, and we intend to continue supporting this API indefinitely. We’re introducing the Responses API to simplify workflows involving tool use, code execution, and state management. We believe this new API primitive will allow us to more effectively enhance the OpenAI platform into the future.

An API that is going away is the Assistants API, a perpetual beta first launched at OpenAI DevDay in 2023. The new Responses API solves effectively the same problems, but better, and the Assistants API will be sunset "in the first half of 2026".
The most important feature of the Responses API (a feature it shares with the old Assistants API) is that it can manage conversation state on the server for you. An oddity of the Chat Completions API is that you need to maintain your own records of the current conversation, sending back full copies of it with each new prompt. You end up making API calls that look like this (from their examples):
{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "user",
      "content": "knock knock."
    },
    {
      "role": "assistant",
      "content": "Who’s there?"
    },
    {
      "role": "user",
      "content": "Orange."
    }
  ]
}
These can get long and unwieldy, but the real challenge is when you start integrating tools: in a conversation with tool use you’ll need to maintain that full state and drop messages in that show the output of the tools the model requested. It’s not a trivial thing to work with.
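To make that bookkeeping concrete, here's a minimal sketch of what a client has to track for a single tool-use round trip under the Chat Completions message format. The tool name (get_weather), the call ID and the tool output are all hypothetical; the point is the shape of the messages you must append yourself:

```python
import json

# Your own record of the conversation, which you must maintain and
# resend in full with every API call.
messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# The model responds with a tool call rather than text; you append that
# assistant message to your copy of the conversation...
tool_call = {
    "id": "call_123",  # hypothetical call ID
    "type": "function",
    "function": {"name": "get_weather", "arguments": json.dumps({"city": "Paris"})},
}
messages.append({"role": "assistant", "content": None, "tool_calls": [tool_call]})

# ...then run the tool yourself and append its output as a "tool" message,
# linked back via tool_call_id, before making the next call with the
# entire (ever-growing) list.
messages.append({
    "role": "tool",
    "tool_call_id": tool_call["id"],
    "content": json.dumps({"temperature_c": 18}),
})

roles = [m["role"] for m in messages]
```

Every round trip adds more entries, and dropping or mis-ordering any of them breaks the conversation.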
The new Responses API continues to support this list of messages format, but you also get the option to outsource that to OpenAI entirely: you can add a new "store": true property and then in subsequent requests include a "previous_response_id" key, set to the ID of the previous response, to continue that conversation.
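As a sketch of that two-call pattern, here are the request payloads as plain dicts (the response ID "resp_abc123" is a made-up stand-in for whatever ID the first response actually returns):

```python
# First request: opt in to server-side conversation state with "store".
first_request = {
    "model": "gpt-4o-mini",
    "input": "knock knock.",
    "store": True,
}

# Pretend the API returned this ID in the first response:
response_id = "resp_abc123"  # hypothetical value

# The follow-up needs only the new prompt plus a pointer to the stored
# conversation -- no replaying of the full message history.
second_request = {
    "model": "gpt-4o-mini",
    "input": "Orange.",
    "previous_response_id": response_id,
}
```

The second request stays the same size no matter how long the conversation gets.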
This feels a whole lot more natural than the Assistants API, which required you to think in terms of threads, messages and runs to achieve the same effect.
New built-in tools
A potentially more exciting change today is the introduction of default tools that you can request while using the new Responses API. There are three of these, all of which can be specified in the "tools": […] array.

{"type": "web_search_preview"} – the same search feature available through ChatGPT. The documentation doesn’t clarify which underlying search engine is used – I initially assumed Bing, but the tool documentation links to this Overview of OpenAI Crawlers page so maybe it’s entirely in-house now? Web search is priced at between $25 and $50 per thousand queries, depending on whether you’re using GPT-4o or GPT-4o mini and on the configurable size of your "search context".
{"type": "file_search", "vector_store_ids": […]} provides integration with the latest version of their file search vector store, mainly used for RAG. "Usage is priced at $2.50 per thousand queries and file storage at $0.10/GB/day, with the first GB free".
{"type": "computer_use_preview", "display_width": 1024, "display_height": 768, "environment": "browser"} is the most surprising to me: it’s tool access to the Computer-Using Agent system they built for their Operator product. This one is going to be a lot of fun to explore. The tool’s documentation includes a warning about prompt injection risks.
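Putting the three together, this is how they would sit in a single request's "tools" array. The tool entries mirror the examples above; the surrounding request fields and the vector store ID are hypothetical:

```python
request = {
    "model": "gpt-4o",
    "input": "example prompt",
    "tools": [
        # Built-in web search, no extra configuration required:
        {"type": "web_search_preview"},
        # File search against one or more vector stores (ID is made up here):
        {"type": "file_search", "vector_store_ids": ["vs_123"]},
        # Computer use needs a virtual display size and an environment:
        {
            "type": "computer_use_preview",
            "display_width": 1024,
            "display_height": 768,
            "environment": "browser",
        },
    ],
}

tool_types = [t["type"] for t in request["tools"]]
```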

I’m still thinking through how to expose these new features in my LLM tool, which is made harder by the fact that a number of plugins now rely on the default OpenAI implementation from core, which is currently built on top of Chat Completions. I’ve been worrying for a while about the impact of our entire industry building clones of one proprietary API that might change in the future. I guess now we get to see how that shakes out!
Tags: chatgpt, generative-ai, openai, apis, ai, llms, ai-agents, llm-tool-use, llm, rag

AI Summary and Description: Yes

Summary: OpenAI has introduced a new Responses API, enhancing the functionality of its platform for AI applications by improving how conversation state is managed. This change is significant for developers as it could streamline workflows in tool use and code execution, while introducing built-in tools that cater to various tasks.

Detailed Description:

The text outlines recent updates to OpenAI’s API platform pertaining to AI application development, focusing particularly on the introduction of the new Responses API. This API is designed to improve user experience by managing conversation states more efficiently, simplifying technical processes that developers face when integrating tools.

Key Points:

– **New API Introduction:**
– OpenAI announced the launch of the Responses API, which aims to enhance user experience by simplifying workflows.
– The Responses API allows for server-side management of the conversation state, unlike the Chat Completions API, which requires users to maintain this state locally.

– **Transition from Old APIs:**
– The older Assistants API, which is being phased out by mid-2026, will be replaced by the more effective Responses API.
– Developers previously struggled with cumbersome processes when APIs required the entire conversation to be sent with each prompt.

– **Enhanced Functionality:**
– The new API offers an optional "store" property that keeps conversation state on the server, significantly reducing client-side complexity, especially when multiple tools are integrated.

– **Built-in Tools:**
– The Responses API introduces default tools that can be utilized in queries, which include:
1. **Web Search Tool:** Allows for direct web searching capabilities, with pricing based on query count.
2. **File Search Tool:** Facilitates integration with a vector store for retrieval-augmented generation (RAG), with associated costs based on usage.
3. **Computer-Using Agent:** Offers access to the capabilities of OpenAI’s Operator product, which can be leveraged for various tasks.

– **Risk Considerations:**
– The inclusion of built-in tools comes with warnings, particularly about prompt injection risks, an important consideration for security and compliance professionals.

– **Industry Implications:**
– The launch prompts reflection on the industry reliance on proprietary APIs and the need for developers to adapt to changing standards as proprietary tools evolve.

This update is especially relevant for professionals in AI, infrastructure security, and cloud computing, as it not only enhances operational efficiency but also necessitates heightened awareness of potential security vulnerabilities related to API management and tool integration.