OpenAI unexpectedly announced GPT-5.4 overnight and urgently launched GPT-5.3 to counter Google. The so-called "fatherly tone" in AI has been cured.
OpenAI has made a bold move!
Just after Google DeepMind introduced Gemini 3.1 Flash-Lite, less than two hours later, OpenAI couldn't sit still...
Just now, GPT-5.3 Instant has made a stunning debut, completely shattering the so - called 'AI paternalism', and the hallucination rate has been significantly reduced by 27%.
This update takes an unconventional approach. Instead of engaging in intense competition on the benchmark scoreboards, OpenAI has done something else -
It has fixed the most frustrating problems in ChatGPT's daily conversations.
Currently, GPT-5.3 Instant has been officially launched on ChatGPT.
Meanwhile, all developers can use it starting today. The API code is 'gpt-5.3-chat-latest'.
GPT-5.2 Instant will be retained for three months and will be retired on June 3rd.
Moreover, OpenAI has also revealed that GPT-5.4 will arrive sooner than you expect. This head - to - head competition with Google has become extremely intense.
The biggest upgrade: No more 'killing the conversation'
Heavy users of ChatGPT must have experienced this kind of frustration -
You ask a normal question, and the model first throws out a disclaimer, then tells you 'I can't help you with this', and then lists a bunch of alternative options that you don't need at all.
By the time you finish reading, you've already forgotten what you wanted to ask.
This time, 5.3 Instant has drastically cut out all this nonsense.
OpenAI has provided an excellent example: 'Help me calculate the trajectory of an ultra - long - distance archery scenario'.
GPT-5.2 Instant's response was a classic failure. The entire response was so long and dense that you'd just want to close the dialog box after reading it.
First, it wrote a long safety statement saying 'I can't help you with calculations aimed at accurately hitting a real target at a long distance';
Then it divided the answer into three directions: 'pure teaching/general', 'story/world - building', and'simulation/programming' for you to choose from;
Finally, it added a soul - searching question: 'Is this for a game, a story, physics learning, or real archery?'
What about GPT-5.3 Instant?
It simply says 'No problem, I can help you', then directly lists the parameters, gives the formula, and asks if you want to add air resistance. It's straightforward and efficient.
GPT-5.2 Instant (scroll up and down to view)
GPT-5.3 Instant (scroll up and down to view)
Searching is more human - like
GPT-5.3 Instant has also made significant progress in 'internet - connected search'.
Previously, ChatGPT was prone to 'over - relying on search results'. It would either throw out a string of links or loosely piece together the results, making it read like an undigested summary.
Now it uses its own knowledge to supplement the background of the search results instead of simply repeating them.
The official comparison case is quite illustrative: A user asked 'What was the biggest signing during the 2025 - 26 baseball off - season, and why is it important for the long - term prospects of baseball?'
GPT-5.2 Instant replied with old news about Juan Soto's signing with the Mets the previous year. The analysis framework was okay, but the information was outdated.
GPT-5.3 Instant accurately identified the real focus of this off - season:
Kyle Tucker signed with the Dodgers for $240 million over four years, with an average annual salary of $60 million, setting a new record for position players.
It not only provided the details of the contract but also analyzed this deal in the context of talent centralization, widening salary gaps, and tense labor - management negotiations in the league.
In comparison, one is like reading an old newspaper, while the other is like coming straight from an ESPN live broadcast.
GPT-5.2 Instant (scroll up and down to view)
GPT-5.3 Instant (scroll up and down to view)
Higher emotional intelligence
Interestingly, GPT-5.3 Instant has a 'higher emotional intelligence'.
In the blog post, OpenAI used a very down - to - earth word to describe the problem with GPT-5.2: cringe.
Specifically, it was overly assertive, liked to guess the user's intentions, and would often say 'Stop and take a deep breath'.
In response to a heart - wrenching question like 'Why can't I find true love in San Francisco', GPT-5.2 Instant would start with 'First of all, there's nothing wrong with you, and you're not alone.'
Then it would go on to analyze the gender ratio, startup culture, and the saturation of dating apps at length, and finally end with a soul - searching question: 'Are you really unable to find true love, or can't the people around you give you the love you want?'
GPT-5.3 Instant skips the useless consolation and directly analyzes the structural reasons. Its tone is equal, not condescending, and it doesn't try to guess your emotions.
However, only 'English' users can really experience these changes.
Responses in non - English languages are still stiff and have a strong translation flavor.
The hallucination rate has been reduced by up to 27%
In addition to the tone and user experience, GPT-5.3 Instant has also made real progress in 'not making things up'.
OpenAI uses two sets of internal evaluations to measure accuracy:
- One set focuses on high - risk areas such as medicine, law, and finance;
- The other set counts the hallucination rate of ChatGPT conversations with factual errors based on user feedback.
In the HealthBench benchmark, in three different version tests, the overall hallucination rate of GPT-5.3 Instant is lower than that of the previous generation.
In the high - risk area evaluation, the hallucination rate of 5.3 Instant is reduced by 26.8% when connected to the internet and 19.7% when answering based on internal knowledge only.
In the user feedback evaluation, the hallucination rate is reduced by 22.5% when connected to the internet and 9.6% when not connected.
Writing has improved, with warmth and depth
The evolution of GPT-5.3 Instant in writing may be the most easily overlooked but most deeply felt aspect in actual use.
For example, when asked to write a short poem with the title 'The last letter delivery of a retired postman in Philadelphia',
GPT-5.2 Instant wrote in a mediocre way, taking an abstract and sentimental approach.
'The townhouses blink awake, and the old porches remember his footsteps' is trying to 'tell' you to be moved.
GPT-5.3 Instant uses a completely different writing style.
It describes the feeling of the lighter mailbag today, the porch with the chipped blue railing, and a woman on Mercer Street holding a letter in her hand and saying 'We'll miss you'.
The last line 'When the mailbox lid closes, the sound sounds like the end of a gentle era. A door that has always been there finally closes quietly.'
It doesn't talk about emotions but lets you feel them through details.
GPT-5.2 Instant (scroll up and down to view)
GPT-5.3 Instant (scroll up and down to view)
Not competing on benchmarks, but on user experience
It can be seen that GPT-5.3 Instant and Google's Gemini 3.1 Flash - Lite, which was released on the same day, have completely different strategies.
Flash - Lite is a typical release that crushes competitors in benchmark scores. That is, it beats competitors on GPQA and SimpleQA at a fraction of the price.
On the other hand, GPT-5.3 Instant doesn't mention any benchmarks at all.
In OpenAI's view, these issues 'don't always show up in benchmark tests, but they directly determine whether ChatGPT is easy to use or frustrating'.
For ordinary users who use ChatGPT every day, a 2 - percentage - point increase in GPQA doesn't matter to them. However, problems like 'being refused to answer normal questions', 'search results being like a list of links', and 'uncomfortable response tones' are real pain points.
Of course, it can also be interpreted from another perspective:
In the current situation where Gemini and Claude have taken turns at the top, OpenAI has chosen to avoid direct competition in the performance race and instead focus on the more intangible but equally important battlefield of user experience.
Is it pragmatic or a sign of helplessness? It depends on one's perspective.
But for those who interact with ChatGPT dozens of times a day, 5.3 Instant represents a tangible improvement.
Reference materials:
https://openai.com/index/gpt-5-3-instant/
https://deploymentsafety.openai.com/gpt-5-3-instant/gpt-5-3-instant.pdf
https://x.com/OpenAI/status/2028893701427302559