
OpenAI's internal reasoning model recently won a gold medal at IOI 2025 and ranked first among all AI participants.

New Intelligence Yuan (新智元), 2025-08-12 11:48
After winning gold at the IMO, the model has now also secured gold at the IOI.

OpenAI's internal reasoning model has won a gold medal at IOI 2025, outscoring 325 human contestants. It ranked 6th overall and 1st among AI participants. The model is the same version that won the IMO gold medal, with no IOI-specific training. The competition had a 5-hour time limit, allowed 50 submissions, and did not provide internet access.

Fresh off its IMO gold medal, OpenAI's internal reasoning model has now also won a gold medal at the IOI.

As with the earlier IMO result, OpenAI used a strawberry image to represent this reasoning model.

This time, the "strawberry" not only wears an IOI gold medal but also looks more anthropomorphic. The image is likely to become the mascot of OpenAI's internal reasoning system.

The "internal reasoning system" OpenAI refers to is the same model that won the IMO gold medal and stirred controversy last time.

After the IMO competition, OpenAI ran a comprehensive evaluation of the IMO gold-medal model and found that, beyond math competitions, it is currently its best model in many other fields, including programming.

OpenAI therefore decided to enter the exact same IMO gold-medal model, without any changes, in the IOI.

OpenAI's official post also confirmed this news.

This internal reasoning model scored high enough to rank 6th overall (including human contestants) and 1st among AI participants in this year's online IOI competition.

Sheryl Hsu said that the internal model competed in the IOI's online AI track and was measured against a field of 330 human contestants.

The top 5 contestants were all human.

In this competition, the AI contestants faced the same constraints as the human contestants: a 5-hour time limit and a maximum of 50 submissions.

Also like the humans, the reasoning system had no internet access and no RAG search, and could only use basic terminal tools.

The reasoning model was not specifically trained for the IOI.

In other words, apart from connecting the model to the IOI API, the AI had to rely on its own reasoning for everything else.

Actually, OpenAI participated in the IOI competition last year and ended up slightly below the bronze medal cutoff.

In just one year, OpenAI's reasoning model has jumped from the 49th percentile to the 98th percentile.
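The ranking figures quoted in this article can be sanity-checked with a short sketch. It assumes, as the numbers above suggest, that the reference field is the 330 human contestants and that exactly five of them scored higher than the model. The function names are illustrative and are not part of any IOI or OpenAI tooling, and the percentile definition used here (share of humans outscored) is an assumption, since the source does not state how the percentile was computed.

```python
def humans_outscored(total_humans: int, humans_ahead: int) -> int:
    """Number of human contestants who placed below the model."""
    if not 0 <= humans_ahead <= total_humans:
        raise ValueError("humans_ahead must be between 0 and total_humans")
    return total_humans - humans_ahead


def percentile_among_humans(total_humans: int, humans_ahead: int) -> float:
    """Share of the human field (in percent) that the model outscored.

    Assumed definition: 100 * (humans outscored) / (total humans).
    """
    return 100.0 * humans_outscored(total_humans, humans_ahead) / total_humans


# Figures taken from the article: 330 human contestants, top 5 all human.
print(humans_outscored(330, 5))                    # 325
print(round(percentile_among_humans(330, 5), 1))   # 98.5
```

Under these assumptions the result is consistent with both the "325 human contestants" and the "98th percentile" figures reported above.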

OpenAI's internal reasoning model: the IOI gold medal team

However, not long after this news was released, Elon Musk's Grok also joined the fray!

First of all, it should be clear that this "internal reasoning model" is not a consumer-facing model, and no one outside OpenAI can access it.

So, how do the current top-tier commercial models perform at the IOI?

The answer is: terribly.

According to the test results of Vals AI, the commercial model that currently leads at IOI is actually Grok 4.

All current top-tier models have obvious deficiencies, and none of them would have won a medal in any year's competition.

Grok 4 leads with an accuracy of 26.2%, followed by GPT-5, Gemini 2.5 Pro, and Claude Opus 4.1.

Vals AI tested these models through their public endpoints, and all commercial models still have a lot of room for improvement at IOI.

Moreover, in this test, Vals AI found that the principle of "you get what you pay for" also applies to the field of large models.

Only expensive models that cost more than $2 per question can achieve meaningful performance.

In other words, the reasoning model inside OpenAI's lab is far more powerful than the commercial models currently available to the public.

This may lead people to wonder how far the most advanced AI in the top-tier labs is ahead of what the public can use.

This has sparked a lot of speculation and discussion.

From the IMO gold medal fiasco, we can see that tech giants are very eager to achieve this "leading position".

To prove that Google Gemini is the "first AI model to win an IMO gold medal", the organizing committee even announced that "OpenAI's claim" was invalid.

There was even a story where OpenAI was accused of faking the IMO gold medal, and Terence Tao exposed the inside story.

Now, right after the release of GPT-5, OpenAI has announced the IOI gold medal. It is easy to predict that this will set a challenge for future models such as Grok 5 and Gemini 3.

Why are tech giants like OpenAI, Google, Anthropic, and xAI so obsessed with topping the charts and winning competitions?

The obsession of tech giants with topping the charts and competition rankings fundamentally stems from the high competitiveness of the AI industry and the rapid iteration of technology.

First of all, topping the charts is one of the most direct and effective marketing methods.

A leading position in the rankings not only means a technological advantage but also represents market influence and brand recognition. Once a model achieves excellent results in authoritative competitions such as IMO and IOI, the company can quickly establish a strong brand image, attract public attention, and enhance user trust.

Secondly, competition rankings in the AI field are usually highly correlated with the general performance and application potential of the model. Whether it is IMO or IOI, these competitions test the model's basic reasoning, logical deduction, and generalization abilities.

In other words, winning a competition means that the model not only performs well in specific tasks but also may have a leading technological advantage in a wider range of application scenarios.

Finally, winning a competition can greatly increase the attractiveness to talent and capital.

The OpenAI team traveled to Bolivia to participate in IOI in person

For this reason, AI giants such as OpenAI, Google DeepMind, Meta, and Anthropic are always eager to compete with each other in competitions. Every change in the rankings may affect the future landscape of the AI industry.

So, who is the strongest AI on earth?

Maybe this competition will continue until the day we achieve AGI.

References

https://x.com/SherylHsu02/status/1954966118680105150

This article is from the WeChat public account "New Intelligence Yuan", author: Ding Hui. It is published by 36Kr with authorization.