Source URL: https://simonwillison.net/2025/Sep/17/icpc/#atom-everything
Source: Simon Willison’s Weblog
Title: ICPC medals for OpenAI and Gemini
Feedly Summary: In July it was the International Math Olympiad (OpenAI, Gemini), today it’s the International Collegiate Programming Contest (ICPC). Once again, both OpenAI and Gemini competed with models that achieved Gold medal performance.
OpenAI’s Mostafa Rohaninejad:
We received the problems in the exact same PDF form, and the reasoning system selected which answers to submit with no bespoke test-time harness whatsoever. For 11 of the 12 problems, the system’s first answer was correct. For the hardest problem, it succeeded on the 9th submission. Notably, the best human team achieved 11/12.
We competed with an ensemble of general-purpose reasoning models; we did not train any model specifically for the ICPC. We had both GPT-5 and an experimental reasoning model generating solutions, and the experimental reasoning model selecting which solutions to submit. GPT-5 answered 11 correctly, and the last (and most difficult) problem was solved by the experimental reasoning model.
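The generate-then-select pattern described in that quote is straightforward to sketch in outline. The snippet below is a hypothetical illustration only, not OpenAI's actual pipeline: `best_submission`, `generate_fn`, and `score_fn` are invented names standing in for whatever internal model calls the real system made.

```python
# Hypothetical sketch of "generate with an ensemble, let one reasoning model
# choose what to submit". It does not reflect OpenAI's real system; the
# callables are placeholders for actual model invocations.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Candidate:
    model: str  # which ensemble member produced this solution
    code: str   # proposed solution source for one ICPC problem

def best_submission(
    statement: str,
    models: list[str],
    generate_fn: Callable[[str, str], str],       # (model, statement) -> code
    score_fn: Callable[[str, Candidate], float],  # (selector, candidate) -> confidence
    selector: str,
) -> Candidate:
    """Generate one candidate per ensemble member, then let the selector
    model pick the candidate it is most confident in."""
    candidates = [Candidate(model=m, code=generate_fn(m, statement)) for m in models]
    return max(candidates, key=lambda c: score_fn(selector, c))

if __name__ == "__main__":
    # Stubbed model calls, purely for illustration.
    fake_generate = lambda model, stmt: f"// {model} solution to: {stmt[:20]}"
    fake_score = lambda selector, cand: float(len(cand.code))  # stand-in confidence
    pick = best_submission(
        "Problem A: shortest path with constraints ...",
        models=["gpt-5", "experimental-reasoner"],
        generate_fn=fake_generate,
        score_fn=fake_score,
        selector="experimental-reasoner",
    )
    print(pick.model)
```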
And here’s the blog post by Google DeepMind’s Hanzhao (Maggie) Lin and Heng-Tze Cheng:
An advanced version of Gemini 2.5 Deep Think competed live in a remote online environment following ICPC rules, under the guidance of the competition organizers. It started 10 minutes after the human contestants and correctly solved 10 out of 12 problems, achieving gold-medal level performance under the same five-hour time constraint. See our solutions here.
I’m still trying to confirm if the models had access to tools in order to execute the code they were writing. The IMO results in July were both achieved without tools.
Tags: gemini, llm-reasoning, google, generative-ai, openai, ai, llms
AI Summary and Description: Yes
Summary: The text discusses the gold-medal performance of AI models from OpenAI and Google DeepMind (Gemini) at a prestigious programming contest, showcasing their problem-solving and reasoning capabilities. This highlights advances in generative AI that are relevant to software security and information security professionals.
Detailed Description: The provided text primarily describes the competitive performance of AI models, specifically from OpenAI and Google DeepMind, in two significant contests — the International Math Olympiad and the International Collegiate Programming Contest (ICPC). Here are the notable points:
– **AI Competitiveness**: Both OpenAI and Google DeepMind’s models achieved gold-medal level results at the ICPC, demonstrating that AI can effectively tackle complex algorithmic problems.
– **Performance Details**:
– OpenAI’s system received the problems in the same PDF form as human contestants and used a reasoning system to select answers, eventually solving all 12 problems (its first answer was correct on 11 of them; the hardest required nine submissions).
– GPT-5 answered 11 problems correctly; an experimental reasoning model solved the hardest problem and decided which solutions to submit.
– Google DeepMind’s Gemini 2.5 Deep Think model also performed remarkably, solving 10 out of 12 problems under competition constraints.
– **Operational Context**:
– Both models competed under ICPC rules and the same five-hour time limit as the human teams; Gemini’s run started 10 minutes after the human contestants began.
– The ability of these models to tackle the problems without any training tailored to the ICPC indicates a high degree of generalization.
– **Technical Insights**:
– It remains unconfirmed whether the models had access to tools to execute the code they were writing, especially since the IMO results in July were achieved without tools.
Key implications for professionals in AI, cloud, and infrastructure security include:
– **Understanding AI Capabilities**: The high-performance levels of these AI systems can aid in recognizing their potential and limitations regarding software security challenges.
– **Integration Potential**: The applications of such generative AI models could extend to automating complex tasks usually performed by developers, raising questions of trust and the need for robust security measures around their deployment.
– **Future Innovations**: The competitive success illustrates a growing trend in leveraging AI for high-stakes problem-solving, necessitating ongoing discussions around security, compliance, and ethical considerations in AI usage.
In conclusion, the results from these programming contests reflect not only the state of AI advancement but also the need for professionals in the field to stay abreast of how these models can impact software and information security strategies.