Google opens its world model to the public for the first time.
This prototype is initially available to Google AI Ultra subscribers in the United States.
If progress in artificial intelligence can be heard as a symphony, then for the past few years the theme of the movement has undoubtedly been "generation": generating text, images, sound, and even video. In early 2026, however, a brand-new melody began to play, one that does not merely generate but constructs.
At around midnight on January 30th (Beijing time), Google DeepMind opened Project Genie to the public. Regarded as one of the most advanced world models currently available, Project Genie is an experimental research prototype built on the Genie 3 world model, and this is the first time the model has been offered to the public in an interactive form.
The word "Genie" comes from the Arabic word "jinni" (spirit). After being transformed through French into "génie", it became an English word. Its most common meaning refers to a "spirit" or "djinn" in Arab and Islamic mythology that can fulfill the wishes of the summoner. Google DeepMind named its world model project "Project Genie" to interpret the connotation of this myth: this AI model can instantly generate a virtual world that you can enter and interact with, based on any scenario you describe in words (the summoner's wish).
When AI can not only depict dreams but also allow people to enter and interact with them, perhaps it's time to rethink the boundary between the "virtual" and the "real" that we've been discussing.
Currently, the prototype is available only to Google AI Ultra subscribers (USD 125 for 3 months) in the United States who are 18 or older.
What makes Project Genie different?
The underlying technology of Project Genie is the Genie 3 world model. Unlike large content-generation models such as OpenAI's Sora, it is not limited to multimodal content generation (an AI video tool, for example, produces a finished clip for the user, drawing only on human-curated text, image, and video data). Instead, it generates an entire space, in effect "creating a world out of nothing":
Just describe a scenario in words or upload a picture, such as "a marshmallow castle surrounded by a chocolate river", and a real-time, interactive 3D virtual world is generated within seconds.
Users can command characters to walk, fly, or drive freely within it, just like playing a video game, and explore this imagined world.
The surrounding environment is generated dynamically and continuously according to the user's viewpoint and actions. Rather than relying on a traditional game engine to render pre-authored assets, the model infers latent physical laws and spatial logic on the fly, generating the path and scenery ahead in real time as the user moves.
Technically, the core of a world model is simulating the dynamics of an environment: predicting how the environment evolves and how actions affect it.
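To make that loop concrete, here is a minimal, runnable sketch in Python of the state-action-prediction cycle a world model implements. Nothing here is Genie 3's actual API; every class, method, and the toy "physics" are hypothetical stand-ins for a learned model.

```python
# Conceptual sketch only: a world model maps (state, action) -> (next state, observation),
# generating the environment ahead on demand as the user moves.

from dataclasses import dataclass


@dataclass
class WorldState:
    prompt: str                 # the scenario the world was conditioned on
    position: tuple[int, int]   # toy stand-in for the model's latent state


class ToyWorldModel:
    MOVES = {"forward": (0, 1), "back": (0, -1), "left": (-1, 0), "right": (1, 0)}

    def init_from_prompt(self, prompt: str) -> WorldState:
        # A real world model would encode the text or image prompt into a latent state.
        return WorldState(prompt=prompt, position=(0, 0))

    def step(self, state: WorldState, action: str) -> tuple[WorldState, str]:
        # Predict the next state from (state, action), then "render" an observation.
        dx, dy = self.MOVES.get(action, (0, 0))
        x, y = state.position
        next_state = WorldState(state.prompt, (x + dx, y + dy))
        frame = f"{state.prompt}, view from {next_state.position}"
        return next_state, frame


model = ToyWorldModel()
state = model.init_from_prompt("a marshmallow castle surrounded by a chocolate river")
for action in ["forward", "forward", "left"]:
    state, frame = model.step(state, action)  # the world is generated as the user moves
    print(frame)
```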
Google DeepMind has extensive experience in developing AI agents for specific environments such as chess and Go. However, to achieve artificial general intelligence (AGI), the system must be able to understand and handle the almost infinite complexity and diversity of the real world.
Genie 3 is a crucial step in this direction. It offers unprecedented simulation capability, generating interactive environments for any real or fictional scenario, and thus provides a powerful tool for fields such as robotics, animation production, and virtual exploration of historical scenes.
For the development of AI, the significance of Project Genie goes far beyond a cool demo. Its core value lies in providing an infinite, safe, cost-controllable "simulation training ground" and "trial-and-error sandbox" for AI agents (and future robots). Agents can learn and train in the vast, diverse simulated environments Genie creates, absorbing the physical rules and causal logic of the real world, an indispensable cornerstone for achieving AGI.
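To illustrate the "trial-and-error sandbox" idea, the sketch below shows, in simplified Python, how an agent might collect experience from many cheap, safe simulated rollouts. The environment and policy here are hypothetical toys; real agent training on Genie-style worlds involves learned policies, reward models, and far richer scenes.

```python
# Sketch of an agent gathering experience in simulated worlds: episodes are cheap to
# reset, so the agent can fail safely thousands of times before touching reality.

import random


def make_simulated_env(prompt: str) -> dict:
    # Stand-in for a world model spinning up a fresh environment per episode.
    goal = (random.randint(-3, 3), random.randint(-3, 3))
    return {"prompt": prompt, "pos": (0, 0), "goal": goal}


def step_env(env: dict, action: str) -> tuple[float, bool]:
    dx, dy = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}[action]
    x, y = env["pos"]
    env["pos"] = (x + dx, y + dy)
    done = env["pos"] == env["goal"]
    return (1.0 if done else 0.0), done   # reward for reaching the goal


experience = []
for episode in range(100):                                       # many cheap rollouts
    env = make_simulated_env("a cluttered kitchen for a household robot")
    for t in range(20):
        action = random.choice(["up", "down", "left", "right"])  # placeholder policy
        reward, done = step_env(env, action)
        experience.append((env["pos"], action, reward))          # data the agent learns from
        if done:
            break
```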
From this perspective, the world model is not just a content-creation tool but a bridge connecting today's AI with future "embodied intelligence", and key infrastructure for AI to learn "common sense" and "causality".
AI academics and tech giants are vying for a share
AI pioneers almost unanimously believe that world models are crucial to building the next generation of artificial intelligence, and many have said the technology will ultimately contribute to AGI that surpasses human intelligence.
Fei-Fei Li, a Stanford professor often called the "godmother of AI", founded the world-model startup World Labs; according to insiders this month, she is in a new round of talks with investors that could value the company at roughly $5 billion. Earlier reports said that AMI Labs, the world-model startup founded by Yann LeCun, often called the "godfather of AI", has attracted potential backers including Cathay Innovation in a financing round that could value the former Meta chief AI scientist's company at $3.5 billion. Nvidia CEO Jensen Huang has long argued that world models can enable "physical AI" that autonomously controls robots, self-driving cars, and other devices. Meta's superintelligence AI lab is working with its robotics team to build a world model that simulates the physical laws of the real world, aiming to give robots the spatial perception and fine manipulation abilities they currently lack...
Of course, as early-stage work, world models such as Project Genie remain very immature. In Project Genie, each generation-and-exploration session is strictly capped at 60 seconds. The generated world is not always physically convincing, sometimes failing to follow the prompt or the physical laws of the real world precisely, and character control often suffers from latency or inaccurate responses. In addition, some advanced features shown in early demos, such as changing world events through on-the-fly commands, are not available in this version.
Some of these limitations stem from the enormous compute that world models consume, which is also the core tension in current AI model technology. DeepMind researchers acknowledge that every user session is served by a dedicated accelerator running at full power; each seemingly effortless act of "world creation" depends on it. At this stage, then, the prototype is less a door anyone can walk through than a narrow window for peeking at the future.
The gaming industry may be the first to embrace world models
Leading AI teams such as Google DeepMind and World Labs believe that world models may first reshape the gaming and film industries.
Traditional 3D asset creation and scene building are labor- and time-intensive core processes. Project Genie shows that parts of pre-production concept design, scene prototyping, and even dynamic storyboard previews could be compressed into minutes or even seconds. This is not about replacing professional creative engines, but it may reshape the starting point of the creative process and greatly accelerate creative validation.
Shlomi Fruchter, co-lead of DeepMind's world-model project Genie 3, previously said, "Software development, especially game development, is undergoing significant changes. I expect these changes to be even more radical in the next few years."
At the end of last year, World Labs officially launched its first commercial product, Marble, a generative-AI-driven 3D world-generation system. Fei-Fei Li said the technology will impact game engines such as Unity and Epic's Unreal: "All of this will be disrupted. It's really time for simulation-based game engines to be upgraded."
In addition to gaming, companies such as xAI and Nvidia also hope to embed world models in robots and self-driving cars.
This article is from the WeChat official account "Science and Technology Innovation Board Daily". Author: Song Ziqiao. Republished by 36Kr with permission.