GPT-5: Turning Ordinary People into PhDs, Yet Magic Still Does Not Exist

Netizens are already asking, "When will GPT-6 arrive?"

After much anticipation, GPT-5 finally made its debut at 1 a.m. last night. Over a launch event lasting one hour and ten minutes, OpenAI showed the world a large model that is more powerful and easier to use, that can understand or accurately guess what the user really wants, and that delivers results that meet expectations.

In Sam Altman's words at the launch event, GPT-5 reaches PhD-level knowledge across many fields and its capabilities rival those of professionals, enabling ordinary people to accomplish tasks that were previously unimaginable.

Compared with two years ago, when OpenAI released GPT-4, the world now understands large models far better and has far more experience with them. Audiences and users are no longer amazed that a model can understand internet meme images. Even so, as someone who uses AI products almost every day, I still found the release of GPT-5 quite astonishing.

The main reason is that the launch event made it clear to me that OpenAI wants to turn the large model from a big toy, one that "plays" with language and "intelligence" and delivers surprises and frustrations in equal measure, into a reliable helper in daily life. It should be like your mobile phone: when you are away from it, you feel inconvenienced, uncomfortable, and even insecure.

Below, I will use several moments from the launch event to show how this is taking shape.

If your child asks you to explain Bernoulli's principle in fluid mechanics, earlier AIs might have handed you an article, while GPT-5 can build an interactive page for you from a one-sentence request.

If you want to learn French, GPT-5 can generate a learning app like Duolingo according to your request. You can use it to memorize words and review them through a snake game. If you are not satisfied with the generated app, you can also directly ask GPT-5 to modify it through natural language.

If you are the CFO of a startup, you can ask GPT-5 to build a detailed, interactive financial dashboard from all your data in about three minutes. All it needs is a description of about 100 words. GPT-5 writes the code from scratch and carries the task through to completion. It guesses the form of presentation you want and automatically optimizes both the code and the visual result.

Most striking of all, during the demonstration an OpenAI staff member used just three prompts to have GPT-5 generate a 3D castle with a built-in shooting game, in which you can also chat with the soldiers on the castle. Clicking on the surrounding balloons fires ammunition that bursts them, accompanied by explosion sound effects.

From the demonstration, we can see that GPT-5 has fully evolved into a versatile treasure chest that directly outputs professional products.

But to be honest, what moved me most was OpenAI's presentation of the model's capabilities in health care. OpenAI invited a patient who had recovered from three types of cancer to talk about how GPT-5 had helped her during her treatment.

She said that when she received her diagnosis, her doctor offered several treatment options. Only after discussing her situation in detail with GPT-5 did she truly understand it. She then chose the option that suited her best and ultimately beat the illness. With no medical background, she can hardly imagine how she would have made sense of the doctor's options without GPT-5's advice, and she does not know whether she would have survived.

After watching the launch event, it is clear that large model technology itself has settled onto a relatively stable development curve. The release of GPT-5 suggests that OpenAI has no further "magic" that would make large models advance by leaps and bounds. And the hand-to-hand, "cold-weapon" fight among AI giants will only become more intense.

Performance Introduction

Model System

GPT-5 is no longer a single model but a model system:

• An automatic router determines the intent of each query

• Simple questions are routed to the chat version (for extremely fast response)

• Complex questions are routed to the reasoning version (for in-depth thinking)
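The routing idea above can be sketched as follows. This is only a toy illustration: OpenAI has not published how its router decides, so the cue words, the word-count threshold, and the names `FAST_MODEL`, `REASONING_MODEL`, and `route_query` are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical labels for the two back ends; not OpenAI identifiers.
FAST_MODEL = "chat"
REASONING_MODEL = "reasoning"

# Assumed cue words hinting that a query needs multi-step reasoning.
REASONING_CUES = ("prove", "debug", "step by step", "analyze", "derive")


@dataclass
class Route:
    model: str   # which back end was chosen
    reason: str  # short explanation of the decision


def route_query(query: str, max_fast_words: int = 40) -> Route:
    """Pick a back end with a toy heuristic (illustrative only)."""
    text = query.lower()
    for cue in REASONING_CUES:
        if cue in text:
            return Route(REASONING_MODEL, f"cue word: {cue!r}")
    if len(text.split()) > max_fast_words:
        return Route(REASONING_MODEL, "long query")
    return Route(FAST_MODEL, "short query without reasoning cues")


if __name__ == "__main__":
    for q in ("What is the capital of France?",
              "Debug this race condition in my thread pool."):
        print(q, "->", route_query(q).model)
```

A production router would presumably be a learned classifier rather than keyword rules; the sketch only shows the shape of the decision: one entry point, two back ends.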

It has a 256k token context window, supports text and image input, and supports function calls and structured output.
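To make "function calls" concrete, here is a minimal sketch of the general pattern: the application describes a tool with a JSON-Schema-style definition, the model replies with a structured call, and the application validates and runs it. The tool `get_weather`, the field names, and the `dispatch_tool_call` helper are assumptions for illustration, not OpenAI's actual API.

```python
import json

# Hypothetical tool description in JSON-Schema style (field names assumed).
WEATHER_TOOL = {
    "name": "get_weather",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def get_weather(city: str) -> dict:
    """Stand-in local function; a real one would query a weather service."""
    return {"city": city, "forecast": "unknown"}


# Registry mapping each tool name to its schema and implementation.
TOOLS = {"get_weather": (WEATHER_TOOL, get_weather)}


def dispatch_tool_call(raw_call: str) -> dict:
    """Parse a model-emitted call (JSON text), validate it, and run it."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    schema, func = TOOLS[name]
    args = call.get("arguments", {})
    missing = [key for key in schema["parameters"]["required"] if key not in args]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    return func(**args)


if __name__ == "__main__":
    # Simulated structured output from a model.
    reply = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
    print(dispatch_tool_call(reply))
```

Structured output matters here because the reply can be parsed with `json.loads` and checked against the schema before any code runs.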

When I open ChatGPT now, the model option in the upper-left corner defaults to "GPT-5", and the previous-generation models no longer appear in the drop-down menu. As Altman promised earlier, the GPT-5 era does away with cumbersome model selection. The model decides for itself whether the current situation calls for a quick response or for in-depth thinking and reasoning.

Coding and Writing

OpenAI calls GPT-5 "our most powerful coding model to date". It excels at complex front-end generation and at debugging large codebases. It usually needs only one prompt to create beautiful, responsive websites, applications, and games, and it turns ideas into reality with a good sense of aesthetics.

OpenAI also says that GPT-5 is "our most powerful writing tool to date", able to write engaging text with literary depth and rhythm. It handles structurally ambiguous writing more reliably, such as sustained unrhymed iambic pentameter or smooth, natural free verse, combining respect for form with clear expression. This means that ChatGPT can better help users with everyday tasks such as drafting and editing reports, emails, and memos.

We also briefly tested the new model's poetry writing. Given the topic "The first cup of milk tea in autumn", the result is indeed more natural than GPT-4's (it reads less like AI-generated text).

Evaluation

GPT-5's overall intelligence has improved significantly, as reflected in its performance on academic and human-evaluated benchmarks, especially in mathematics, coding, visual perception, and health.

It sets new highs in mathematics (94.6% on AIME 2025 without tools), real-world coding (74.9% on SWE-bench Verified and 88% on Aider Polyglot), multimodal understanding (84.2% on MMMU), and health (46.2% on HealthBench Hard). These improvements also show up in everyday use.

With the extended reasoning of GPT-5 pro, the model also sets a new high on GPQA, scoring 88.4% without tools.

GPT-5 has reached the top of LMArena.

In the preview-access evaluation by Artificial Analysis, GPT-5 also took first place.

Reduced Hallucination

When the search function is enabled, the probability of GPT-5 making factual errors is about 45% lower than that of GPT-4o. In the "thinking" mode, this probability is 80% lower than that of OpenAI o3.

Beyond factual errors, AI often "lies blatantly". For example, when it cannot do something or lacks the permission to do it, it may cheerfully tell you that the job is done. GPT-5 is also less prone to this kind of "deception". To test this, OpenAI removed all images from the prompts of the multimodal benchmark CharXiv. OpenAI o3 still gave confident answers about the non-existent images 86.7% of the time, compared with only 9% for GPT-5.
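The percentages above mix two kinds of figures: relative reductions ("45% lower") and absolute rates (86.7% versus 9%). A small sketch of the arithmetic may help. The function names are hypothetical, and the 10% baseline in the second example is an invented number used only to show the calculation, since the article does not give GPT-4o's absolute error rate.

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Fraction by which `new` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - new) / baseline


def apply_reduction(baseline: float, reduction: float) -> float:
    """Rate that results from lowering `baseline` by a relative `reduction`."""
    return baseline * (1 - reduction)


if __name__ == "__main__":
    # CharXiv with images removed: o3 at 86.7% versus GPT-5 at 9%.
    drop = relative_reduction(86.7, 9.0)
    print(f"relative reduction: {drop:.1%}")  # about 89.6%

    # Assumed 10% baseline error rate, lowered by 45%.
    print(f"new rate: {apply_reduction(10.0, 0.45):.1f}%")
```

So the CharXiv result corresponds to roughly a 90% relative drop in confident answers about missing images.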

More "Efficient" and More "Economical"

In OpenAI's evaluation, GPT-5 (with thinking) outperforms OpenAI o3 while using 50% to 80% fewer output tokens across capabilities such as visual reasoning, agentic coding, and solving graduate-level scientific problems.

That is to say, GPT-5 achieves greater value with less thinking time.
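As a rough illustration of what a 50% to 80% reduction in output tokens means in practice, the sketch below computes the resulting range for a given baseline. The helper name and the 10,000-token baseline are assumptions chosen only for the example.

```python
def output_tokens_after(baseline_tokens: int, reduction_pct: int) -> int:
    """Tokens left after cutting `baseline_tokens` by `reduction_pct` percent."""
    if not 0 <= reduction_pct <= 100:
        raise ValueError("reduction_pct must be between 0 and 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    return baseline_tokens * (100 - reduction_pct) // 100


if __name__ == "__main__":
    baseline = 10_000  # hypothetical number of output tokens for an o3 answer
    best = output_tokens_after(baseline, 80)
    worst = output_tokens_after(baseline, 50)
    print(f"GPT-5 would use between {best} and {worst} tokens")
```

Under these assumptions, an answer that previously took 10,000 output tokens would take between 2,000 and 5,000.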

API

On API pricing, GPT-5 pairs the strongest performance with a remarkably low price. It seems that OpenAI has mastered cross-generation optimization methods.

OpenAI and Altman undoubtedly have high hopes for GPT-5, and they also know that the outside world has been looking forward to this generation of models for a long time.

Altman said that, for the first time, using the model really feels like talking to an expert in a given field. If GPT-4o is like a college student, then GPT-5 is like a PhD-level expert.

The importance OpenAI attaches to this release also shows in the length of the launch event. OpenAI's previous online launches for new models ran only about half an hour, but this one lasted more than an hour. Altman himself also posted a running text commentary on X (formerly Twitter) during the event.

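The views expressed in this article are solely those of the author. The 36Kr platform provides only information storage services.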