Welcome back to the open-source large-model race, OpenAI. Let me walk through the key points I care about.
On August 5, 2025 (US Pacific Time), OpenAI released two open-source large models, GPT-OSS 120B and GPT-OSS 20B. Both models can currently be downloaded from the Hugging Face platform, and users may modify, customize, and commercially deploy them. Mainstream cloud platforms, including Amazon AWS and Microsoft Azure, have also begun offering services based on the two models. This is OpenAI's first open-source large-model release since November 2019.
History is really ironic. The "Open" in "OpenAI" stands for openness and open source, which Sam Altman once touted as the core spirit and survival path of the AI era. Yet from early 2019 onward, OpenAI deliberately drifted away from the open-source track. In February of that year, citing "safety concerns", it declined to publish the full parameter weights of GPT-2 and released only a partial model with 774 million parameters; it was not until November, when GPT-2 was attracting little attention, that it reluctantly published all 1.5 billion parameters. As for the later and far more successful GPT-3, GPT-3.5, and GPT-4 series, neither the parameter weights nor technical white papers describing their approach have ever been published.
Until yesterday, OpenAI was one of the few developers in the world's top tier of foundational AI large-model R&D with no new open-source large models. Another is Anthropic, which has never released an open-source model since its founding. Considering that Anthropic was founded by former employees dissatisfied with OpenAI, it really does prove the saying, "Birds of a feather flock together."
Among its competitors: Google has maintained the open-source Gemma series since 2024, advancing in parallel with the closed-source Gemini series. Meta's LLaMA series is the spiritual ancestor of today's mainstream open-source large models and needs no introduction. France's Mistral released an open-source version of its very first model. Elon Musk's Grok also had an open-source release early on. Alibaba's Qwen has become one of the open-source model families with the most derivative versions. As for DeepSeek, had it not been open-source, it would never have achieved such influence and breadth of adoption.
Some will surely ask: why open-source at all? For competitors, open-sourcing is of course a good thing, since it facilitates mutual learning and reference (and even plagiarism). For humanity as a whole, it is also a good thing, because history has repeatedly shown that openness accelerates technological progress. But for a leading developer like OpenAI, why open-source? It may attract more attention from the technical community and help build a healthy ecosystem, but GPT is already the best-known large model in the world. What practical significance does open-sourcing have, apart from justifying the company's name and shedding the "CloseAI" label?
The answer is clear: open-source large models can be downloaded onto local hardware and run entirely on-premises, which is quite attractive to certain customers. To summarize:
Customers can keep all their data on-premises instead of uploading it to a third-party platform, maximizing data security. This level of security matters greatly for both state and commercial secrets.
Customers can fine-tune open-source models to their own needs and fit them to specific industry scenarios. Complex or sensitive industries such as healthcare and finance have especially strong demand for this.
For budget-constrained customers, running a model on local hardware may be more cost-effective than paying for access to a closed-source model. GPT-OSS-20B, for example, can even run on a laptop (a minimal local-loading sketch follows this list).
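To make the "run it locally" point concrete, here is a minimal sketch of loading GPT-OSS 20B through the Hugging Face transformers library. The repository id "openai/gpt-oss-20b", the chat-template call, and the hardware assumptions are my own illustrative choices rather than details taken from OpenAI's release notes.

```python
# Minimal local-inference sketch, assuming a recent transformers + accelerate install
# and enough GPU or unified memory for the 20B checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-20b"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # spread weights across available GPU/CPU memory
)

# All data stays on the local machine; nothing is sent to a third-party API.
messages = [{"role": "user", "content": "Summarize our internal policy document."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```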
Of course, deploying an open-source model locally means the customer takes on its own information security and technical maintenance. Even after weighing these costs, many large industry customers still prefer open-source models. This is why the LLaMA series is so popular among large enterprises in Europe and the United States, and why DeepSeek swept through domestic government and enterprise customers at the beginning of this year. DeepSeek's technical level may be comparable to that of o1, but had it not been open-source, its adoption would have been far slower on both the B side and the C side.
Now, after an absence of nearly six years, OpenAI has finally returned to the open-source battlefield. To some extent it was surely provoked by open-source models such as LLaMA, DeepSeek, Qwen, and even Grok; but from a business perspective the decision was inevitable. Some enterprise customers will never upload their critical data to a third-party platform, and government departments are even less likely to do so. Rather than leave this vast market to competitors, better to take it yourself. Had competitors' technological progress been slower, OpenAI's return to the open-source track might also have been slower, but only slightly.
This also makes 2025 a "year of open source": Baidu, once the domestic leader, and OpenAI, still the leader abroad, have both released open-source large models; Meta has released its latest open-source version, and Alibaba has markedly accelerated its open-source releases. At this point, among the world's mainstream large-model developers, only two have no open-source version at all: Anthropic, mentioned above, and ByteDance in China. The Doubao model (and its predecessor Yunque) currently has no open-source version in any form, and ByteDance has announced no open-source plans. From a purely technical standpoint, however, Doubao is not in the world's top tier, so whether it is open-sourced has little bearing on the overall progress of large-model technology.
On to the next topic: what impact will OpenAI's open-sourcing have on global large-model technology? I am not a technical developer, so I can only speak from common sense. My view is that there is an impact, but a limited one. On the one hand, OpenAI has not open-sourced its latest version or its latest technology (naturally; neither would you). On the other hand, the outside world's "guesses" about OpenAI's technical route over the past two years have turned out to be fairly accurate.
According to OpenAI, the training data of the two GPT-OSS versions has a cutoff of June 2024, and training was completed in August 2024. Their performance is roughly comparable to o3 and o3-mini, which were released four months ago. Many evaluations note that GPT-OSS-120B outperforms the latest versions of DeepSeek and Qwen. In fact this adds no new information, because o3 already outperforms them; it only confirms that OpenAI still leads its competitors by at least a few months, which we already knew.
On the technical route, OpenAI's own white paper tells us roughly the following:
GPT-OSS adopts a mixture-of-experts (MoE) architecture, as the outside world had long guessed; MoE is now mainstream and used by almost every major model. GPT-OSS 120B has 128 experts per layer and GPT-OSS 20B has 32 experts per layer; for each token, the router activates the 4 experts best suited to it. These details are still useful (a toy routing sketch follows).
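For readers who want a mechanical picture of "activate 4 of 128 experts", below is a toy top-k routing layer in PyTorch. The layer dimensions, the softmax-over-top-k gating, and the simple per-expert loop are my own illustrative choices; the white paper gives only the expert counts and the top-4 activation, not the implementation.

```python
# Toy sketch of top-k mixture-of-experts routing: 128 experts per layer,
# only 4 of which actually run for any given token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=512, d_hidden=2048, n_experts=128, top_k=4):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)   # scores every expert for each token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                             # x: (tokens, d_model)
        scores = self.router(x)                       # (tokens, n_experts)
        top_vals, top_idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(top_vals, dim=-1)         # gate weights over the chosen 4 only
        out = torch.zeros_like(x)
        for slot in range(self.top_k):                # run just the selected experts
            for e in top_idx[:, slot].unique().tolist():
                mask = top_idx[:, slot] == e
                out[mask] += weights[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out

layer = ToyMoELayer()
print(layer(torch.randn(8, 512)).shape)  # torch.Size([8, 512])
```

The point of the sketch is the sparsity: the router touches all 128 experts only through a cheap linear scoring step, while the expensive feed-forward computation runs for just 4 experts per token.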
GPT-OSS is pre-trained on standard text; the chain-of-thought (CoT) capability is instilled in the post-training stage rather than in pre-training. CoT is the basis of so-called "deep-reasoning" models, and it is now clear that OpenAI, like its competitors, adds CoT during post-training.
In post-training, like o3, GPT-OSS uses reinforcement learning over chains of thought (CoT RL). External APIs and RAG agents are also used in the post-training process; I won't go into the details here. To some extent this confirms the outside world's guesses.
OpenAI chose not to suppress "hallucinations" during post-training, because doing so would reduce the transparency of the CoT. As a result, GPT-OSS's hallucination rate in deep-reasoning mode is quite high, which may be an unavoidable problem for all deep-reasoning models.
Overall, most of the technical route above had either already been guessed by the outside world or was under active debate. Some details, such as the specific methods and tools used in post-training, may inspire outsiders, but the improvement they bring is limited. After all, if OpenAI really has any "secret recipes", it is unlikely to publish them openly in a white paper. What the white paper does prove is this: over the past two years, most of the guessing and imitation of OpenAI's technical route by the world's model developers has been correct (or rather, OpenAI has only admitted to the parts that were correct). The power of human imitation is, on the whole, boundless, which is why few technological leaders in history have managed to maintain a long-term monopoly on leading technology by their own strength alone.
It should be emphasized that GPT-OSS is only an "open-weight" model, not a fully "open-source" model in the strict sense. OpenAI has published the parameters and their values (the weights), a 34-page technical white paper, and a small amount of other selectively chosen information. Anyone who truly wanted to "replicate" the finished product by the same means would find at least the following pieces missing:
The various "scaffolding" models used in training, including models for corpus quality scoring, corpus similarity detection, and corpus cleaning, as well as the reward model used to "align" with human values. Some competitors partially publish these; OpenAI has not.
The pre-training corpus itself, which is a core trade secret, especially now that the amount of data used in training keeps growing and high-quality corpora are becoming harder to find. Meta once partially published the corpus used by LLaMA; OpenAI has not.
Other tools used in the training process. If they are standard tools, that is fine; but if they are proprietary, the outside world cannot imitate them even when their names are disclosed.
Large models that fully meet these "open-source" conditions are very rare, and for commercial companies a "fully open-source" release is all but impossible. The reason is simple: companies release open-source models to meet some customers' needs and to cultivate a developer ecosystem, not to make plagiarism easier. The information OpenAI has provided this time is valuable but not sufficient, which is probably exactly the effect it wanted. It reminds me of certain technology giants' prospectuses: hundreds of pages that appear to provide a wealth of financial and business information, yet dodge the key user and technology questions in every possible way. I won't name names.
Incidentally, OpenAI did disclose the training cost of GPT-OSS: trained on NVIDIA H100 GPUs, the 120-billion-parameter version consumed 2.1 million H100-hours, and the 20-billion-parameter version about one-sixth of that. From this we can infer the scale of the compute cluster used. Assuming a 30-day training run, about 2,917 H100s were used; assuming 45 days, about 1,944. Given that the training data cuts off in June 2024 and training finished around the end of July or the beginning of August, the run is unlikely to have lasted much more than 45 days (the arithmetic is sketched below).
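A back-of-the-envelope check of those cluster-size figures: divide the disclosed GPU-hours by the hours in an assumed training window. The 2.1 million H100-hours figure is the one cited above; the 30- and 45-day windows are the assumptions already made in the paragraph, not anything OpenAI disclosed.

```python
# Implied cluster size = total GPU-hours / (days * 24 hours)
H100_HOURS_120B = 2_100_000  # disclosed H100-hours for the 120B version

for days in (30, 45):
    gpus = H100_HOURS_120B / (days * 24)
    print(f"{days}-day run -> about {gpus:,.0f} H100s")

# Output:
# 30-day run -> about 2,917 H100s
# 45-day run -> about 1,944 H100s
```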
So GPT-OSS was not trained on the latest Blackwell-series GPUs, nor on a "ten-thousand-GPU cluster" or anything larger. Does that mean the compute requirements for training top-tier models are not so high after all? Don't jump to conclusions: GPT-OSS is not OpenAI's flagship model, just one of countless models trained internally. GPT-4's parameter count is reportedly as high as 1.37 trillion, more than ten times that of OSS, and its compute requirements are certainly far higher. The precious B100/B200 GPUs may be fully occupied by the training of GPT-4.5 and GPT-5; unfortunately, OpenAI is unlikely to disclose the training details of either.
My guess is that GPT-OSS may be among the last models OpenAI trains on Hopper-architecture GPUs, and that models after GPT-4.5 may be trained entirely on Blackwell; but that is only a guess. As for the H100s no longer used for training, they will be redirected to inference, since the spread of deep-reasoning models means an explosion in inference demand. Whether or not the Scaling Law still holds, the world's computing power probably needs to grow three- to four-fold to meet the booming demand for training and inference.
This article has not received any funding or endorsement from OpenAI or any of its competitors.
The author of this article does not hold any shares of OpenAI, nor does he directly hold shares of its competitors, although indirect holdings of competitors through funds, trust plans, and the like are almost unavoidable.
This article is from the WeChat official account "Internet Phantom Thief Group" (ID: TMTphantom). Author: Pei Pei, the leader of the Phantom Thief Group. It is published by 36Kr with authorization.