In a historic first, GPT-5 topped a premier global programming competition, with the best human team finishing second and Beijing Jiaotong University ranking first among Chinese teams.
Zhidx reported on September 18 that, early that morning, OpenAI and Google announced in quick succession that their models had achieved gold-medal-level performances in the finals of ICPC 2025 (the 49th International Collegiate Programming Contest), one of the world's most renowned programming competitions.
OpenAI's reasoning system answered all 12 problems correctly, solving 11 on the first attempt and cracking the hardest one after 9 submissions; compared with the human teams, it would rank first. The advanced version of Gemini 2.5 Deep Think solved 10 problems in 677 minutes and would rank second against the human teams.
If AI were included in the overall ICPC rankings, the top three would be OpenAI's reasoning system, St. Petersburg State University, and the advanced version of Google Gemini 2.5 Deep Think.
ICPC requires contestants to solve 12 complex algorithmic problems within 5 hours; both the correctness of each solution and the time taken to solve it affect a team's ranking.
In the end, among the 139 competing teams, the top four took gold medals: St. Petersburg State University, the University of Tokyo, Beijing Jiaotong University, and Tsinghua University. St. Petersburg State University solved the most problems, 11 in total.
Human teams that won the gold medal in ICPC
This is another showing of strength by OpenAI's reasoning system and Google's Gemini 2.5 Deep Think at a top-tier international competition, following their participation in the International Mathematical Olympiad (IMO) two months ago.
The code generated by the advanced version of Google Gemini 2.5 Deep Think during the ICPC finals has been open-sourced on GitHub.
GitHub address:
https://github.com/google-deepmind/gemini_icpc2025
01.
OpenAI Scored a Perfect Score
Google Missed Two Problems
ICPC is the world's oldest, largest, and most prestigious university-level algorithmic programming competition. Each year, participants from nearly 3,000 universities in more than 103 countries compete to solve real-world programming problems.
Both OpenAI and Google took part and achieved gold-medal-level performances: OpenAI's reasoning system solved all 12 problems, the advanced version of Google Gemini 2.5 Deep Think solved 10, and the best human team solved 11.
1. OpenAI: A perfect score, with 11 problems solved on the first attempt
OpenAI's reasoning system achieved a perfect score.
OpenAI said that it did not train the model specifically for ICPC; instead, it entered the competition with an ensemble of general-purpose reasoning models.
During the competition, GPT-5 and an experimental reasoning model jointly generated solutions, with the experimental model responsible for screening which solutions to submit. In the end, GPT-5 correctly answered 11 problems, and the final and most difficult problem was solved by the experimental model.
The system solved 11 problems on the first attempt; the hardest problem was cracked on the 9th submission.
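OpenAI has not published the details of this pipeline. Purely as an illustration of the generate-then-screen idea, the sketch below uses hypothetical callables (`generators` wrapping the solution-proposing models and `screener` wrapping the screening model); it is not OpenAI's actual interface.

```python
from typing import Callable, List

# Minimal sketch of a generate-then-screen pipeline (hypothetical interfaces,
# not OpenAI's actual system): several generator models each propose candidate
# solutions, and a separate screening model decides which one gets submitted.
def pick_submission(
    problem: str,
    generators: List[Callable[[str], str]],  # e.g. wrappers around two reasoning models
    screener: Callable[[str, str], float],   # confidence score for (problem, candidate code)
    samples_per_model: int = 4,
) -> str:
    candidates = [gen(problem) for gen in generators for _ in range(samples_per_model)]
    # Submit only the candidate the screening model trusts most.
    return max(candidates, key=lambda code: screener(problem, code))
```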
2. Google: Solved 10 problems, 8 of them within 45 minutes
Following ICPC rules, the advanced version of Gemini 2.5 Deep Think competed live in a remote online environment, starting 10 minutes after the human contestants. Gemini spent a total of 677 minutes solving 10 of the 12 problems, with 8 problems solved within the first 45 minutes and the remaining 2 taking about 3 hours.
The following figure shows the time taken to solve each problem in the 2025 ICPC finals, with Gemini's times in blue and the fastest university team's times in gray.
Gemini was slower than the fastest human team on 3 of the problems.
Time taken to solve each problem in the ICPC finals
In addition, Google DeepMind noted that, within half an hour, Gemini solved a difficult problem that stumped every human team.
Problem C asked teams to design a scheme for delivering liquid to a group of reservoirs through an interconnected network of ducts, with the goal of finding the duct configuration that fills every reservoir in the shortest time.
The problem admits infinitely many possible configurations, because each duct can be fully open, fully closed, or partially open, which makes finding the optimal configuration extremely difficult.
Introduction to Problem C
Gemini found an effective approach: it first assumed that each reservoir had a "priority value" representing how strongly it should be favored relative to the other reservoirs.
Given a set of priority values, a dynamic programming algorithm can then find the optimal duct configuration.
Gemini observed that, by applying the minimax theorem, the original problem could be transformed into finding the priority values that impose the tightest constraint on the final flow.
Exploiting this relationship between the priority values and the optimal flow, Gemini used nested ternary searches to quickly locate the optimal priority values in a bowl-shaped (convex) solution space, and thus solved Problem C.
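The article does not reproduce Gemini's code. As a rough, self-contained illustration of the technique only, the sketch below minimizes an assumed bowl-shaped (convex) function of two variables with nested ternary searches; the toy function stands in for the real mapping from priority values to fill time.

```python
from typing import Callable, Tuple

def ternary_search(f: Callable[[float], float], lo: float, hi: float,
                   iters: int = 100) -> float:
    """Return an x in [lo, hi] that approximately minimizes a unimodal f."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2  # the minimum lies in [lo, m2]
        else:
            lo = m1  # the minimum lies in [m1, hi]
    return (lo + hi) / 2

def nested_ternary_search(g: Callable[[float, float], float],
                          lo: float, hi: float) -> Tuple[float, float]:
    """Minimize a convex g(x, y): the inner search finds the best y for a fixed x,
    and the outer search runs over x (the minimum over y of a convex g is convex in x)."""
    def best_for_x(x: float) -> float:
        y = ternary_search(lambda y: g(x, y), lo, hi)
        return g(x, y)
    x = ternary_search(best_for_x, lo, hi)
    y = ternary_search(lambda y: g(x, y), lo, hi)
    return x, y

# Toy stand-in for "fill time as a function of two priority values":
# a bowl with its minimum at (1.0, -2.0).
if __name__ == "__main__":
    g = lambda x, y: (x - 1.0) ** 2 + (y + 2.0) ** 2 + 3.0
    print(nested_ternary_search(g, -10.0, 10.0))  # approximately (1.0, -2.0)
```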
Currently, Gemini users who subscribe to Google AI Ultra can use the lightweight version of Gemini 2.5 Deep Think in the Gemini App.
02.
ICPC Gold-Level Performance
Demonstrates the Abstract Reasoning Ability of Large Models
Google DeepMind's blog attributed Gemini's performance to innovations in pretraining, post-training, reinforcement learning techniques, multi-step reasoning, and parallel thinking.
For example, during reinforcement learning, researchers trained Gemini to reason about and generate code for some of the hardest problems programmers face, learning from feedback on the results to improve its approach. To solve a problem, multiple Gemini agents each propose their own solutions, execute and test the code in a terminal, and then iterate on the solutions using all of the previous attempts.
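DeepMind has not published the details of this loop. As a hypothetical sketch of such a propose-test-iterate cycle, the code below treats `propose` as a stand-in for a Gemini agent that returns a candidate Python solution, runs each candidate against the problem's sample input/output pairs, and feeds failed attempts back into later rounds.

```python
import os
import subprocess
import tempfile
from typing import Callable, List, Optional, Tuple

def passes_samples(source: str, samples: List[Tuple[str, str]]) -> bool:
    """Run a candidate Python solution against sample (input, expected-output) pairs."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(source)
        path = f.name
    try:
        for stdin, expected in samples:
            try:
                result = subprocess.run(
                    ["python3", path], input=stdin,
                    capture_output=True, text=True, timeout=10,
                )
            except subprocess.TimeoutExpired:
                return False
            if result.returncode != 0 or result.stdout.strip() != expected.strip():
                return False
        return True
    finally:
        os.unlink(path)

def solve(problem: str, samples: List[Tuple[str, str]],
          propose: Callable[[str, List[str]], str],  # hypothetical agent call
          agents: int = 4, rounds: int = 3) -> Optional[str]:
    """Propose-test-iterate: several agents propose per round; failures are fed back."""
    failed: List[str] = []
    for _ in range(rounds):
        candidates = [propose(problem, failed) for _ in range(agents)]
        for code in candidates:
            if passes_samples(code, samples):
                return code        # first candidate that passes the samples
        failed.extend(candidates)  # later rounds see all earlier failed attempts
    return None
```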
Google DeepMind's internal testing shows that the advanced version of Gemini 2.5 Deep Think could also achieve gold-medal-level performances on the 2023 and 2024 ICPC World Finals problems, performing no worse than the world's top 20 competitive programmers.
Achieving a gold-level performance in ICPC has direct practical implications for software development. If the best AI and human solutions from the competition were combined, all 12 problems would have been solved thoroughly and correctly. This suggests AI has the potential to offer unique ideas and complement human experts.
Beyond mathematics and programming, the advanced version of Gemini 2.5 Deep Think also demonstrated its abstract reasoning ability.
ICPC problems require the model to understand a complex problem, design a multi-step logical plan, and implement it flawlessly. This calls for the same skills needed in many scientific and engineering fields, such as designing new drugs or microchips.
OpenAI researchers posted on X that they used the same set of models to compete in the IMO and IOI, demonstrating the models' capability and generality.
03.
Conclusion: The Ability of Large Models to Solve Complex and Abstract Problems is Improving
From the International Mathematical Olympiad (IMO) to this programming contest, OpenAI's and Google's models have shown great potential on ever more challenging mathematical and reasoning problems. Dr. Bill Poucher, ICPC's global executive director, said that ICPC has always been committed to setting the highest standards for problem solving, and that Gemini's achievement in this arena marks a pivotal moment in defining the next generation of AI tools and academic standards.
Together, these breakthroughs in competitive programming and mathematical reasoning attest to a leap in large models' ability to solve abstract reasoning problems; paired with human experts, they may help tackle even more complex problems.
This article is from the WeChat official account "Zhidx" (ID: zhidxcom). Author: Cheng Qian, Editor: Li Shuiqing. Republished by 36Kr with permission.