OpenAI employees let slip: they have already tried GPT-5. It is set to arrive in July and appears to be fully multimodal.
[Introduction] Have OpenAI employees already been using GPT-5? Just today, Altman followed a mysterious account on X, sparking speculation across the internet. Not only have two people hinted that they may have had early access to GPT-5, but some netizens also appear to have been included in a staged rollout. The GPT-5 expected this summer has already set the whole internet abuzz!
Just today, discussion of GPT-5 has heated up again, with cryptic hints flying all over X.
The reason is that Sam Altman followed a person named Yacine on X.
This person said he had just tried an AI company's large model and that the experience was stunning. He bet that no one could predict the storm ahead.
Another user, "Aidan", commented under the post that he had had the same experience.
Many people speculate that what they tested was GPT-5.
The reason: Aidan is an OpenAI employee, and Yacine had just been fired by xAI yet suddenly caught Altman's attention. It is hard to believe both statements are a coincidence.
There is a high possibility that they have obtained early access to GPT-5.
If so, what they saw must be astonishing. Perhaps this is the moment just before the internet melts down.
In addition, an insider said Yacine has been considering founding a startup. Now that Altman has followed him, perhaps he plans to recruit him to OpenAI?
In short, the whole internet has once again fallen into a craze of discussing GPT-5.
Is GPT-5 already in a staged rollout?
Actually, the netizens' suspicion is understandable: more and more people have shared experiences suggesting they were included in a staged rollout (gray-scale test) of GPT-5.
For example, one netizen found that while using OpenAI's models, he was routed to a brand-new AI.
Without any special prompting, it could think continuously for three minutes while running a large number of searches.
Also on the 26th, another netizen found that even with 4o selected, ChatGPT would start to "think", raising suspicions that OpenAI is quietly transitioning to GPT-5.
GPT-5 to be released this summer
Previously, in the OpenAI podcast, Altman was quite certain about the release time of GPT-5 - "possibly sometime this summer".
Just a week ago, Altman also appeared at the AI Startup School event hosted by YC in San Francisco.
He revealed in an interview: GPT-5 will move towards full multimodality!
Specifically, GPT-5, expected to be launched this summer, is a multimodal model that supports multiple input methods such as voice, image, code, and video.
GPT-5 will not fully realize OpenAI's ultimate vision for future models, but it will be an important step in the process.
The ultimate vision of the GPT-5 series of models is a fully multimodal integrated model.
It will have deep reasoning ability, be able to conduct in-depth research, generate real-time videos, write a large amount of code, create brand-new applications for users instantly, and even render real-time videos for user interaction.
When all this is achieved, it will bring a brand-new computer interface - almost "disappearing" and becoming imperceptible.
Even earlier, in February this year, Altman posted on X that one of OpenAI's major goals is to unify the o-series and GPT-series models: building a system that can use all tools and knows when to think for a long time and when not to, so that it can handle a wide range of tasks.
The GPT-5 model will be released in ChatGPT and the API, integrating functions such as voice, canvas, search, and Deep Research.
Netizens also have many predictions about GPT-5. Many people think that it will be the first true hybrid model that can dynamically switch between reasoning and non-reasoning during the response process.
In summary, its key features are multimodality, a 1-million-token context window, reasoning plus memory, fewer hallucinations, and the unification of the o-series and GPT models.
It can be said to point toward the future of AI agents.
Some people also predict that the improvements of GPT-5 will mainly focus on the following aspects.
- The video modality will be more "native" and input more natural;
- Agent performance will improve by at least 50%, driven by heavy use of reinforcement learning;
- It will have stronger understanding and intuition, especially in chaining tasks or composing multiple learned behaviors into more complex ones;
- A hierarchical structure (Hierarchy) may appear;
- Rather than just the "trick" of selecting the appropriate model, it may use a VLM-to-VLM architecture, substituting small, fast VLMs for large ones to improve generality, speed, and responsiveness.
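The cascading idea in the last bullet can be sketched as a simple confidence-gated router. This is a speculative illustration of the general technique, not OpenAI's actual architecture; the model stubs, the confidence threshold, and the scoring rule are all invented for the example.

```python
# Hypothetical model cascade: a small, fast model answers first; if its
# confidence falls below a threshold, the query escalates to a large model.

def small_model(query):
    # Stand-in for a fast, cheap model: returns (answer, confidence).
    # Here we fake confidence with a crude length heuristic.
    confidence = 0.9 if len(query.split()) <= 4 else 0.3
    return f"small-answer:{query}", confidence

def large_model(query):
    # Stand-in for a slow, more capable model.
    return f"large-answer:{query}"

def cascade(query, threshold=0.8):
    """Return (answer, which_model_answered)."""
    answer, confidence = small_model(query)
    if confidence >= threshold:
        return answer, "small"
    return large_model(query), "large"

if __name__ == "__main__":
    print(cascade("what is 2+2"))        # short query: small model suffices
    print(cascade("derive the gradient of a transformer layer"))
```

The design trade-off is the usual one for cascades: the threshold trades latency and cost (small model answers more often) against quality (large model catches the hard cases).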
However, an OpenAI employee revealed that internal models are at most about two months ahead of the publicly available ones. GPT-5 will therefore be a modest improvement rather than a huge leap; the difference is that it will be integrated with many tools.
Just a month ago, Michelle Pokrass, a core researcher of GPT-4.1, revealed the progress of GPT-5.
She said that the challenge in building GPT-5 lies in finding the right balance between reasoning and chatting.
She said, "o3 thinks seriously but is not suitable for casual chatting. GPT-4.1 improves coding ability by sacrificing some chatting quality."
"Now, the goal is to train a model that knows when to think seriously and when to chat."
At the same time, she shared for the first time more about the development process behind GPT-4.1 and the key role that RFT plays in the product. In improving model performance, for example, GPT-4.1 focuses on long context and instruction following.
In addition, fine-tuning plays an important role in GPT-4.1. The emergence of RFT (Reinforcement Fine-Tuning) opens new possibilities for extending model capabilities; compared with traditional SFT, RFT shows clear advantages in specialized domains.
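A rough intuition for how RFT differs from SFT: instead of imitating reference outputs token by token, the model samples outputs and is nudged toward the ones a grader rewards. The toy loop below illustrates only that idea; the "policy" is a single probability, and the grader is invented for the example. It is not OpenAI's RFT implementation, which optimizes model weights with a reinforcement learning objective.

```python
import random

def grader(output):
    # Invented domain-specific grader: rewards answers that cite a source.
    return 1.0 if "[source]" in output else 0.0

def sample(policy):
    # The "policy" is just the probability of producing a cited answer.
    return "answer [source]" if random.random() < policy else "answer"

def rft_step(policy, lr=0.1):
    output = sample(policy)
    reward = grader(output)
    if reward > 0:
        # Nudge the sampling probability toward the rewarded behavior.
        policy += lr * (1.0 - policy)
    return policy

random.seed(0)
policy = 0.5  # initial chance of producing an answer the grader likes
for _ in range(200):
    policy = rft_step(policy)
print(round(policy, 2))  # converges toward 1.0
```

The contrast with SFT: here no reference answer is ever shown to the model; only the scalar reward from the grader shapes behavior, which is why RFT suits domains where correctness is checkable but demonstrations are scarce.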
Altman's interview with the core team: Pre-training GPT-4.5
In April, Sam Altman's interview with the core technical team also revealed some "knowledge" about the pre-training of GPT-4.5.
The interview partially answered the question of why "pre-training as compression" can lead to general intelligence.
Indigo posted that the core of intelligence lies in the learner gradually capturing the structural nature of the world through compression and prediction, and internalizing it as knowledge.
1. Solomonoff inspiration
The interview mentioned a concept, Solomonoff induction:
Among all the "programs" that can describe (or explain) the data, the simpler the program, the higher its prior probability; the interpretation of the data can then be continuously updated in a Bayesian way.
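The standard form of this prior (stated here from the usual textbook definition, not from the interview) weights each program by its length, and prediction then follows by Bayesian conditioning:

```latex
% Solomonoff's universal prior over strings x: sum over all (prefix-free)
% programs p that make a universal machine U print a string starting with x.
M(x) = \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)}

% Predicting the next symbol a is conditioning on the data seen so far:
M(a \mid x) = \frac{M(x a)}{M(x)}
```

Shorter programs get exponentially more weight, which is exactly the "simpler explanations are more probable" idea from the interview.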
In a language model, every additional character or word predicted correctly means it has found some internal structure in the training data.
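The link between prediction and compression can be made concrete: under arithmetic coding, a symbol the model assigns probability p costs -log2(p) bits, so a model that predicts the data better compresses it into fewer bits. The toy below compares a uniform model with a frequency-based model on the same string; the text and both models are invented for illustration.

```python
import math
from collections import Counter

def code_length_bits(text, prob):
    """Total bits to encode `text` character by character, where `prob`
    gives the model's predicted probability for each character."""
    return sum(-math.log2(prob(ch)) for ch in text)

text = "aaaaaaab"  # skewed toy data: seven a's, one b
alphabet = sorted(set(text))

def uniform(ch):
    # Baseline model: every observed character is equally likely.
    return 1 / len(alphabet)

counts = Counter(text)

def empirical(ch):
    # "Trained" model: uses the characters' empirical frequencies.
    return counts[ch] / len(text)

print(code_length_bits(text, uniform))    # 8.0 bits
print(code_length_bits(text, empirical))  # about 4.35 bits
```

The better predictor halves the code length here, which is the sense in which every improvement in next-token prediction is an improvement in compression.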
2. More "correct compression" means deeper understanding
The interview also emphasized repeatedly that when the model predicts over and over (i.e., searches for the "optimal compression") across multi-domain, multi-context data, it gradually learns cross-domain abstract concepts and associations.
This is what people often call "emergence" or "general intelligence".
3. Complementary relationship between pre-training and subsequent "fine-tuning/inference" strategies
Pre-training + targeted supervised fine-tuning (or reinforcement learning) can make the model more accurate in certain reasoning, logic, or