GPT-5 Arrives: Musk Expresses Doubt

Artificial intelligence giant OpenAI has finally launched GPT-5, its long-awaited next-generation flagship model.


At the press conference, CEO Sam Altman used a rather dramatic analogy to describe the leap: "GPT-3 is like a high school student, occasionally having flashes of inspiration but lacking stability; GPT-4 is like a college student, combining intelligence and practicality; while GPT-5 is like having a conversation with a doctoral-level expert."

He also joked self-deprecatingly that he felt "useless in front of GPT-5", which whetted the appetite of users around the world. The upgrade, which people had awaited for two and a half years, has finally arrived.

In fact, the birth of GPT-5 was not easy. It went through a difficult year of R&D, during which core team members were poached with high salaries and computing costs ran extremely high.

It has been an anxious and difficult journey for OpenAI. Now that it has finally presented this "unified system", everyone is waiting to see what new tricks it can deliver.

Generally speaking, the biggest change in GPT-5 is that it has evolved from a "chatbot" into a real "all-around assistant" that can get things done.

First of all, the multimodal upgrade has made this new assistant much smarter. In the past, you had to talk to it and send it pictures separately. Now it can handle "listening, speaking, reading, writing, and seeing" all at once, and it can immediately make sense of whatever you throw at it.

More importantly, it has learned to "take action". In the past, GPT could only give you advice. Now GPT-5 can connect directly to your other software, help you operate Office, write code in development tools, and even handle workflows.

To make this assistant less rigid, OpenAI has preset different "personalities" for it. You can switch it to a sarcastic, rigorous, or understanding mode at any time.

Since DeepSeek pioneered its "Think mode", having large models "think" before answering seems to have become an industry standard, and various companies have launched similar features.

But GPT-5 is a bit different. It has made this process more straightforward.

You don't need to hunt for or click a button. As long as you add the phrase "think carefully" when asking a question, the system automatically switches to gpt-5-thinking, a mode better suited to in-depth analysis, and uses stronger "brainpower" to solve your problem.

Then comes the traditional GPT highlight: topping the leaderboards. OpenAI presented 25 benchmark charts, and the dense graphics show how well GPT-5 performs across various dimensions.

Factual hallucinations have been significantly reduced: 44% fewer than GPT-4o and 78% fewer than o3. Basically, it no longer spouts nonsense.

It got a full score in the math competition, reached a new high in real-world programming ability, a new high in human knowledge tests, a new high in multimodal ability... Anyway, it's all about new highs.

As soon as the press conference ended, influencers and bloggers around the world rushed in to run all kinds of "extreme stress" tests on GPT-5.

Among them, YouTuber Matthew Berman, who has 500,000 followers, went big. In a 25-minute video, he put GPT-5 through nearly 30 extremely challenging tasks.

For example, he asked GPT-5 to write a program that could instantly generate, scramble, and even solve a complex 20×20×20 Rubik's Cube.

Even more impressively, it built complete Word and Excel clones that run in a web page. Note that these are functional applications, not just a table drawn for you.

From a 3D version of the classic "Game of Life" to a fluid dynamics simulator where you can freely adjust gravity and air resistance, GPT-5 really does seem to be at the "doctoral level" Altman described.

However, while developers are celebrating, the situation on the other side is completely different.

The capital market was the first to vote with its feet. On the day of the release, AI-related concept stocks generally pulled back. Apparently, investors were not entirely impressed by this "doctoral student".

After all, people had waited two and a half years expecting an earth-shattering revolution. Instead, they got a predictable, routine upgrade. Disappointment was inevitable.

Moreover, the AI race has entered a brutal "Warring States" era. The technological gap among the various players is narrowing rapidly, and it is becoming increasingly difficult for OpenAI to pull ahead of its competitors with routine upgrades.

Amid all the commotion, OpenAI's old rival Elon Musk was the first to push back. He pointed straight to performance data, claiming that GPT-5 was not as good as his Grok 4.

Not only are investors and competitors unconvinced, but complaints from ordinary users are also mounting.

People's most intuitive feeling is that although GPT-5 seems to have a higher "IQ", its "EQ" has declined.

Many people reported that when they now use it to write ad copy and scripts, the text feels stiff and mechanical, lacking its former flexibility and naturalness.

Some netizens sharply joked: "Emotion and logic are like a seesaw. If you hold down the logic end tightly, the emotion end will fly up, won't it?"

No wonder many netizens worry that GPT-4.5, the model they are most comfortable with, will disappear. However, some enthusiastic netizens soon found that there is actually a switch in the settings, so you can still switch back to the previous model.

This makes many people feel that AI seems to have really entered a bottleneck period.

The most controversial aspect is still programming ability, which OpenAI always hypes the most.

Many people's first reaction is: "It doesn't seem as good as Claude?" This kind of "Altman-style marketing" sounds great in the promotion but falls short in actual use, which leads many people to question the "new highs" on the leaderboards.

However, some netizens reported that when they tested several large models while developing a Cantonese learning application, Claude and Gemini both ran into problems of one kind or another when generating UI and making precise code changes. Only GPT-5 completed the task reliably, and the result was surprisingly good.

In the end, GPT-5 is like a genius who is badly lopsided across subjects: it is terrible at the humanities and can't write text with warmth, but in the sciences, especially in fields that demand strict logic and complex engineering skills, it is still very powerful.

Old Fox thinks this comment from netizens sums it up perfectly: it's just not as good as expected... but it's still the most powerful large model at present.

This article is from the WeChat official account

The views in this article represent only those of the author. The 36Kr platform provides information storage services only.
