
Has China's self-developed Matrix-3D already delivered, ahead of schedule, the "world model" that Fei-Fei Li is betting on?

新智元 (Xinzhiyuan) · 2025-08-12 11:48
Kunlun Wanwei Matrix-3D: Generate an explorable 3D world from a single image, competing with World Labs.

China's self-developed world model Matrix-3D can generate a freely explorable 3D world from just a single image. Its results not only match those of Fei-Fei Li's World Labs but also allow a larger exploration space, putting it at the forefront of AI's understanding of the world.

One flower, one world; one leaf, one Bodhi.

For thousands of years, humans could only imagine the world beyond the pictures they drew. There has always been an untouchable veil between dreams and reality.

Today, as the power of AI keeps extending, this veil is finally lifted:

Matrix-3D, a true world model that can create countless scenarios from a single image!

It is a world model from Kunlun Wanwei and a brand-new upgrade of "Matrix-Zero", the company's first fully self-developed world model.

The evolved world model Matrix-3D can start from a photo of a mountain meadow and create a panoramic view with swaying grass and rolling distant mountains.

Starting from a corner of a modern city, it can "imagine" the bustling streets and buildings beyond the picture.

Multiple views are no longer needed, and the result is no longer limited to partial perspectives. Instead, the output is a 3D world with precise geometric structure that can be roamed freely through a full 360°.

It is worth mentioning that this week is Kunlun Wanwei's AI technology release week, and Matrix-3D is the second model to be unveiled.

Challenging the Core Pain Points of Spatial Intelligence

The large-model race has been fiercely competitive for two years. Everyone is waiting to see where the next breakthrough will come from.

World Labs, which under Fei-Fei Li reached a valuation of US$1 billion in just three months, may be evidence that world models with spatial intelligence are the next frontier for AI's understanding of the world.

Google's recently released Genie 3 has once again raised expectations for "world models". It can generate 720p imagery in real time at 20-24 frames per second and maintain consistency for several minutes.

As an early exploration, Kunlun Wanwei released its self-developed Matrix-Zero world model in February this year. It can:

  • transform user-supplied images into realistic, plausible 3D scenes that can be freely explored;
  • generate interactive video in real time based on user input.

The newly released Matrix-3D is, for the first time, able to "enter the real world from a single image", taking the world model a step further:

  • Global scene consistency: Supports 360° free-view browsing, with accurate geometric structures, natural occlusion relationships, and unified texture styles.
  • Large-scale scene generation: Compared with existing scene generation methods, it generates larger scenes that can be freely explored through 360°.
  • Highly controllable generation: Supports both text and image input. The results are highly consistent with the input, and the generation range can be user-defined and expanded indefinitely.
  • Strong generalization ability: Based on self-developed 3D data and video model priors, it can generate a rich and diverse range of high-quality scenes.
  • Fast generation speed: Billed as the first feed-forward panoramic 3D scene generation model, it can quickly generate high-quality 3D scenes.

Technical report: https://github.com/SkyworkAI/Matrix-3D/blob/main/asset/report.pdf

Project homepage: https://matrix-3d.github.io/

GitHub: https://github.com/SkyworkAI/Matrix-3D

Hugging Face: https://huggingface.co/Skywork/Matrix-3D

Next, let's get a first-hand feel for the "power" of Matrix-3D.

Image Consistency

First, the generated content and colors stay unified and consistent.

Second, in terms of perspective, Matrix-3D supports looking around freely through a full 360°.

An anime-style village with a thatched-roof house, a windmill, and a flower field stretching to the horizon. It is extremely detailed, with warm light and a cozy atmosphere.

In addition, the geometric and occlusion relationships between objects conform to physical laws.

An impressionist-style winter landscape, including mountains, lakes, cottages, trees, and snow. It is mainly in blue tones, with rich brushstroke textures, a quiet atmosphere, high resolution, and bright colors.

The panoramic video generated by Matrix-3D is as follows:

And the final rendering result of the 3D scene looks like this:

A blocky, pixelated landscape, including mountains, trees, water bodies, sky, and clouds. It is similar to the style of "Minecraft", with high resolution, bright colors, rich texture details, and a quiet atmosphere.
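
To make the idea of a 360° panoramic view more concrete, here is a minimal sketch of how a pixel in a panorama maps to a viewing direction. It assumes the common equirectangular layout (longitude across the width, latitude down the height), which the article does not confirm as Matrix-3D's format. The function name `pixel_to_direction` and the axis convention are illustrative assumptions, not part of the project's code.

```python
import math

def pixel_to_direction(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit viewing direction.

    Assumed convention (hypothetical, for illustration only):
    longitude runs from -pi to pi across the image width, latitude runs
    from +pi/2 at the top row to -pi/2 at the bottom row, +y is up and
    +z is the direction the image center looks at.
    """
    lon = (u / width) * 2 * math.pi - math.pi
    lat = math.pi / 2 - (v / height) * math.pi
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)

# The center of the panorama looks straight ahead along +z.
center = pixel_to_direction(1024, 512, 2048, 1024)
```

Every pixel corresponds to exactly one direction on the sphere around the camera, which is why a single panorama can cover the full 360° field of view.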

Precise Control

In the 3D world, our perspective usually moves freely along different paths in various directions.

For these different trajectories, Matrix-3D can generate corresponding 3D scenes.

For example, moving along an S-shaped bend:

Or, moving forward to the right:
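
The two trajectories mentioned above can be thought of as sequences of camera positions and headings. The sketch below is only an illustration of that idea: the function names, the parameters, and the (x, z, yaw) pose format are hypothetical and are not taken from Matrix-3D's code or report.

```python
import math

def s_curve_trajectory(num_steps=50, length=10.0, amplitude=2.0):
    """Return (x, z, yaw) camera poses along an S-shaped path.

    The camera advances along +z while x follows one full sine period.
    Yaw (radians, measured from +z toward +x) points along the path.
    """
    poses = []
    for i in range(num_steps):
        t = i / (num_steps - 1)  # progress along the path, 0 to 1
        x = amplitude * math.sin(2 * math.pi * t)
        z = length * t
        # The path's derivative with respect to t gives the heading.
        dx = amplitude * 2 * math.pi * math.cos(2 * math.pi * t)
        dz = length
        poses.append((x, z, math.atan2(dx, dz)))
    return poses

def forward_right_trajectory(num_steps=50, length=10.0, angle_deg=30.0):
    """Return (x, z, yaw) poses on a straight path angled to the right of +z."""
    yaw = math.radians(angle_deg)
    poses = []
    for i in range(num_steps):
        d = length * i / (num_steps - 1)  # distance travelled so far
        poses.append((d * math.sin(yaw), d * math.cos(yaw), yaw))
    return poses

s_path = s_curve_trajectory()
right_path = forward_right_trajectory()
```

A trajectory-conditioned generator would take a pose sequence like these as its control signal. How Matrix-3D actually encodes camera paths is not described in this article.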

Large-Scale Movement

Compared with the method from Fei-Fei Li's World Labs, Matrix-3D supports movement over a larger range.

As can be seen in the video released by World Labs, "we" hit the boundary after taking just a few steps.

Similarly, Hunyuan World 1.0 also has problems generating the edges of its scenes.