AI Battle Day: OpenAI Goes Open-Source, Claude Unveils Its Most Powerful Coding Model, and Google Showcases Its Stunning World Model

硅星人Pro · 2025-08-06 12:12
Take a seat and pick what you need.

Silicon Valley's three most important model makers each released a milestone model on the same day. It has been a while since we last saw such a chaotic battle day.

August 5th is destined to become an important moment in the evolution of AI technology and the business competition landscape.

On the same day, Google first launched Genie 3, a world model that lets you interact in real time with the 3D world it generates. Then Anthropic updated its most powerful series, Claude Opus, and released Claude 4.1 Opus, whose coding ability continues to break new ground. Finally, OpenAI's long-awaited open-source model arrived. As previously leaked, OpenAI released an open-weight model named GPT-oss. It is the company's second open release of a language model, after GPT-2.

The three releases came one after another within 24 hours. Unlike the direct, fiery competition of the past, however, this time each company is mostly demonstrating a different direction of evolution in its own area of expertise. The AI narrative is shifting from the single dimension of "whose model is stronger" to a more complex and diversified competitive landscape.

OpenAI GPT-oss: A Belated "Open-Source" Release and Smart Positioning

OpenAI has finally handed in its open-weight homework: GPT-oss, a dense model with 13 billion parameters. It is not a state-of-the-art (SOTA) model that can rival GPT-4o or Claude 4.1. Its performance is roughly comparable to that of Llama 3 8B or Qwen2 7B, and on some benchmarks it is even slightly behind peers of the same scale.

But its significance lies not in its performance but in the name "OpenAI" and the license attached to it.

First, it should be noted that this is not fully open source.

GPT-oss uses the "OpenAI Model License 1.0", a custom license written by OpenAI. Its most crucial clause prohibits any commercial entity with annual revenue of over $100 million, or with more than one million daily active users, from using GPT-oss to develop or provide services that compete with OpenAI's core products (such as its APIs and ChatGPT). This "poison pill" clause precisely excludes every potential large-company competitor while letting a large number of small and medium-sized developers and researchers into OpenAI's ecosystem.

Secondly, this is OpenAI's first open release of weights since GPT-2, which represents a major strategic shift. OpenAI is no longer just the aloof closed-source leader. Instead, it is trying to draw developers into its ecosystem with a "good enough" open-source model: use GPT-oss for local development and fine-tuning, then migrate seamlessly to OpenAI's more powerful closed-source models.

As for why OpenAI released an open-source model at all, it stems from the impact of DeepSeek. When a free, open-source model reaches the level that OpenAI's high-priced closed-source models deliver to most of their users, it is a fatal blow. Today's GPT-oss is both a defensive move and a step in expanding OpenAI's ecosystem, aimed at countering the erosion of its developer base by open-source forces such as DeepSeek and Qwen.

Google Genie 3: From Generating the World to "Playing" in the World

GPT-oss is more a product of business strategy, while Google's Genie 3, released on the same day, offers more room for technological imagination.

Genie 3 is described with an overused term, "world model", but it goes a step further. It is no longer content to generate videos or 3D assets; it directly generates an interactive 3D world.

Give it a picture, a text description, or even a sketch, and Genie 3 can create a stylistically consistent 3D environment that follows physical logic, and let you move and interact in it in real time. It understands natural-language instructions such as "walk to the left" and "jump up" and instantly renders the corresponding first-person view.

This is achieved through an architecture called the "Spatio-Temporal Video Transformer" (SVT). Genie 3 was trained on more than 200,000 hours of publicly available game videos (mainly 2D platformers) and learned the causal relationship between actions and the world. It can not only generate a world but also infer the behavior patterns of the objects and characters in it. For example, it can keep the details of a tree consistent across different scenes.

This means that, for the first time, AI can create a virtual space to "play" in, offering a striking prototype for game development, simulators, robot training, and even a path toward the metaverse.

Google's Genie 3 has drawn almost unanimous amazement. Two senior research scientists from NVIDIA, Jim Fan and Phillip Isola, both expressed their shock. Isola called it "crazy", while Jim Fan described it as "a quantum leap".

This AI, which can envision an entire interactive game world from a single picture, has internalized intuitive knowledge of the physical world by learning from a large number of videos. It may be a major step towards general-purpose robots.

Claude 4.1 Opus: The "New God" for Programmers

Anthropic continues to hone its sharpest "spear". The newly released Claude 4.1 Opus has a clear goal: to become the strongest programming assistant.

According to official data, on the HumanEval+ benchmark, which measures code generation, debugging, and logical reasoning, Claude 4.1 Opus scored an astonishing 85.2%, surpassing for the first time the previous record of 84.9% set by GPT-4o. In the internal Agentic Coding evaluation, its problem-solving ability also nearly doubled compared with the previous generation.

In addition to being more capable, Claude 4.1 is also faster and cheaper. For developers and enterprise users, this means that the efficiency and cost-effectiveness of AI coding in real workflows have improved substantially. Anthropic has again chosen the most practical, profit-oriented path, which has now become its moat.

How these models, especially OpenAI's open-source model, perform in real environments will be the focus of the industry's attention. We will also continue to evaluate them hands-on.

Upon closer inspection, this "chaotic battle day" is very different from those of the past. The three models are not directly "sniping" at each other. Instead, they seem to have come together to attract the most attention.

On the one hand, this shows that OpenAI's old tactic of releasing a comparable model at the same time to overshadow competitors is becoming increasingly hard to pull off. GPT-5 is no longer a model whose training completion date the development team can fully determine. It is more like an experimental research project that has to wait for many variables to mature. When your "killer weapon" cannot be in place on time, the stage of competing on "strength" alone is over, and strategy becomes important. Using a strategic "open-source" product to consolidate its position is an inevitable choice for OpenAI.

On the other hand, and more importantly, today's major Silicon Valley companies have begun to show a "clear division of labor".

Anthropic's Claude is truly "far ahead" in programming. The company has benefited from this lead and intends to keep consolidating it. OpenAI is in an unprecedented period of turmoil. It is investing more energy in building a complete ecosystem to preserve a first-mover advantage that still exists but is no longer large, while waiting for GPT-5 to mature. With this combination of moves, it aims to steady its team's morale, maintain its valuation, and keep telling its story. Google, having caught up with the first tier in core LLM capabilities, is clearly starting to play the role of creator of the "next Transformer moment". From Veo 3 to Genie 3, it is investing resources that others cannot or will not invest, betting on the breakthrough of the next paradigm.

The progress of models has not stopped, and the AI world has become even more lively.

This article is from the WeChat official account "Silicon Valley Stars Pro", author: Gemini. Republished by 36Kr with permission.