Fei-Fei Li's World Model "Killer Feature" Open-Sourced: Open Large 3D Web Scenes Instantly and Run 100 Million Splats Smoothly on Mobile Phones

3DGS gets a major upgrade.

Zhidx reported on April 15 that World Labs, the team led by "AI Godmother" Fei-Fei Li, has open-sourced Spark 2.0, a dynamic 3D Gaussian Splatting (3DGS) renderer.

▲ Announcement of Spark 2.0's open-source release (Source: X)

Fei-Fei Li herself commented on the release: "Spark 2.0 can now stream over 100 million Gaussian splats on any device! We are extremely proud to contribute to the open-source ecosystem for web-based 3DGS rendering!"

▲ Fei-Fei Li's comment (Source: X)

Spark was first released last year. It is a dynamic 3DGS renderer built specifically for the web. It integrates with THREE.js, the most popular 3D framework on the web, and uses WebGL2 to run on any device with a web browser, including desktops, iOS and Android devices, and VR headsets.

Compared with the previous version, Spark 2.0 adds a Level of Detail (LoD) system that can stream and render ultra-large-scale 3DGS worlds on any device.

▲ Exploring a children's room freely, with items shown in clear detail (Source: World Labs blog)

In addition, the new version introduces the .RAD 3DGS file format, which supports streaming with progressive refinement. Its Virtual Splats Paging System gives access to an unbounded splat world through a fixed GPU memory allocation. In simple terms, the size of the scene is no longer limited by the amount of GPU memory.

▲ A cave cottage in the grassland with no distortion during scene transition (Source: World Labs blog)

How is such a smooth and coherent result achieved? To handle very large scenes, Spark 2.0 combines three graphics and systems techniques: level-of-detail optimization, progressive streaming, and virtual video memory management.

Fei-Fei Li's team described the three technologies behind Spark 2.0 in detail in their blog:

01. Adopt Continuous Level of Detail to Stably Render Millions of Splats

In computer graphics, a level of detail system is often used when dealing with large 3D scenes. This system automatically adjusts the level of rendering detail based on the distance between the object and the observer.

Level-of-detail methods span a spectrum from discrete to continuous. With a discrete LoD method, the system creates multiple versions of the splats, ranging from simplified to detailed, and then switches between versions based on each version's approximate bounds and its distance from the camera.

Spark's early design supported the discrete mode, but it had obvious flaws. When the user moves through the scene and the system abruptly switches between versions, the image visibly pops. In addition, once the splats are grouped into blocks, the user can see clear seams at the block boundaries.

Spark 2.0 uses a continuous LoD method instead. All splats live in a hierarchy called the LoD splat tree. Spark 2.0 selects splats individually along a cut through the tree to optimize splat detail within the viewport.

▲ LoD splat tree (Source: World Labs blog)

Each internal node in the tree is a lower-resolution version of its child nodes: several child splats are merged into one new splat that approximates their combined shape and color. This merging continues up to the root node of the tree, a single large splat that aggregates the overall shape and color of every splat in the object.
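The article does not say how Spark 2.0 computes a merged splat. As a rough illustration only, the sketch below uses weighted moment matching, one common way to approximate a group of Gaussians with a single Gaussian. The `Splat` class, its `weight` field, and the per-axis variance (instead of a full covariance matrix) are simplifying assumptions made for this example, not Spark's data model.

```python
from dataclasses import dataclass

@dataclass
class Splat:
    center: tuple    # (x, y, z)
    variance: tuple  # per-axis variance; a simplification of a full 3x3 covariance
    color: tuple     # (r, g, b)
    weight: float    # hypothetical importance weight, e.g. opacity times size

def merge(children):
    """Approximate several child splats with one parent splat (moment matching)."""
    total = sum(c.weight for c in children)

    def weighted_average(vectors):
        return tuple(
            sum(c.weight * v[i] for c, v in zip(children, vectors)) / total
            for i in range(3)
        )

    center = weighted_average([c.center for c in children])
    # Law of total variance: each child's own variance plus the squared
    # offset of its center from the merged center.
    spread = [
        tuple(c.variance[i] + (c.center[i] - center[i]) ** 2 for i in range(3))
        for c in children
    ]
    return Splat(
        center=center,
        variance=weighted_average(spread),
        color=weighted_average([c.color for c in children]),
        weight=total,
    )

left = Splat((-1.0, 0.0, 0.0), (0.1, 0.1, 0.1), (1.0, 0.0, 0.0), 1.0)
right = Splat((1.0, 0.0, 0.0), (0.1, 0.1, 0.1), (0.0, 0.0, 1.0), 1.0)
parent = merge([left, right])
```

Here the parent sits midway between the two children, is wider along the x axis than either child, and takes the average of their colors, which is the kind of "low-resolution stand-in" the tree needs at its internal nodes.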

Using this LoD splat tree, Spark 2.0 calculates a "slice" through the tree to select the best N splats for the current viewport for rendering. By setting a maximum splat budget N (usually between 500,000 and 2.5 million splats depending on the device type), the system ensures that only a constant number of splats need to be rendered per frame, thereby achieving stable and high-frame-rate rendering performance. By adjusting the value of N up and down, a trade-off can be made between frame rate and splat details.
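The budgeted "slice" can be pictured as a greedy refinement driven by a priority queue. The sketch below is a minimal illustration under stated assumptions, not Spark's implementation: the `Node` class, the `size / distance` screen-size estimate, and the rule "refine the node that looks largest on screen first" are all choices made for this example.

```python
import heapq
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    size: float      # world-space extent of the splat (hypothetical field)
    distance: float  # distance from the camera (hypothetical field)
    children: list = field(default_factory=list)

    def screen_size(self) -> float:
        # Assumed heuristic: apparent size shrinks with distance.
        return self.size / max(self.distance, 1e-6)

def select_cut(root: Node, budget: int) -> list:
    """Pick at most `budget` splats by refining the largest-looking nodes first."""
    heap = [(-root.screen_size(), 0, root)]  # max-heap via negated priority
    tick = 1       # unique tie-breaker so Node objects are never compared
    count = 1      # the root alone is a valid cut of one splat
    selected = []
    while heap:
        _, _, node = heapq.heappop(heap)
        extra = len(node.children) - 1  # net cost of swapping a node for its children
        if node.children and count + extra <= budget:
            count += extra
            for child in node.children:
                heapq.heappush(heap, (-child.screen_size(), tick, child))
                tick += 1
        else:
            selected.append(node)  # leaf, or no budget left to refine it
    return selected

near = Node("A", 4, 2, [Node("A1", 2, 2), Node("A2", 2, 2)])
far = Node("B", 4, 20, [Node("B1", 2, 20), Node("B2", 2, 20)])
root = Node("root", 8, 10, [near, far])

names = sorted(n.name for n in select_cut(root, budget=3))
```

With a budget of 3, the nearby branch is split into `A1` and `A2` while the distant branch stays as the single coarse splat `B`. Raising the budget to 4 refines `B` as well, which mirrors the trade-off between frame rate and detail described above.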

▲ A bicycle in the park with realistic details and strong consistency (Source: World Labs blog)

Spark 2.0 further extends this algorithm to traverse multiple LoD splat tree instances at the same time. Instead of starting the traversal from a single root node, the extended algorithm seeds the initial priority queue with the screen size and splat node (dm0, Sm0) of every 3DGS object. The rest of the process is unchanged, so a single pass decides which levels of detail to refine across all 3DGS objects in the scene.

This design makes the creation of large-scale combined worlds simple and efficient: Just add 3DGS LoD objects at any position in space, and Spark 2.0 can automatically calculate the optimal global subset of all LoD splats to be rendered per frame.
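A minimal sketch of the multi-object case follows, assuming each object is a small tuple-based tree with its own camera distance. The tree encoding, the `size / distance` priority, and the function names are hypothetical. The only point illustrated is that every object's root goes into one shared queue, so all objects compete for the same budget.

```python
import heapq
from itertools import count

def make_tree(size, depth):
    """Toy LoD tree: (size, [subtrees]); each level halves the splat size."""
    if depth == 0:
        return (size, [])
    return (size, [make_tree(size / 2, depth - 1), make_tree(size / 2, depth - 1)])

def select_global_cut(objects, budget):
    """objects maps name -> (camera distance, tree). Returns splats chosen per object."""
    tick = count()  # unique tie-breaker for heap entries
    heap = []
    for name, (distance, tree) in objects.items():
        # Seed the queue with every object's root instead of a single root.
        heapq.heappush(heap, (-tree[0] / distance, next(tick), name, distance, tree))
    used = len(heap)
    chosen = {name: 0 for name in objects}
    while heap:
        _, _, name, distance, (size, children) = heapq.heappop(heap)
        if children and used + len(children) - 1 <= budget:
            used += len(children) - 1
            for child in children:
                heapq.heappush(
                    heap, (-child[0] / distance, next(tick), name, distance, child)
                )
        else:
            chosen[name] += 1
    return chosen

scene = {
    "near": (1.0, make_tree(4.0, 3)),    # close to the camera
    "far": (100.0, make_tree(4.0, 3)),   # identical object, far away
}
allocation = select_global_cut(scene, budget=9)
```

Both objects have identical trees, yet the nearby one is refined all the way down to its eight leaves while the distant one is drawn as a single coarse splat, because the shared queue always spends the budget where it matters most on screen.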

02. Design a New File Format to Open Large-Scale 3D Worlds on the Web Instantly

Spark 2.0 defines a new file format, .RAD (short for RADiance field). The format compresses 3DGS data and supports random-access streaming, so a scene can be refined progressively as its data arrives over the network.

The two most common 3DGS file formats today are .PLY and .SPZ, which represent two different ways of laying out data: row storage and column storage.

.PLY files are stored in row order, so each splat can be displayed as soon as its data arrives, which allows progressive loading. However, the data is uncompressed and spends more encoding precision than it needs. .SPZ files group data of the same type together in column order, which compresses better. Unfortunately, this rules out progressive loading, because the entire file must arrive before any single splat has all of its attributes.
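The progressive-loading half of this trade-off can be shown with a toy encoding. The four-attribute record below (three floats plus one opacity byte) is an assumption made for illustration and is not the real .PLY or .SPZ layout. The sketch only counts how many complete splats a partial download contains under each layout.

```python
import struct

# Toy splats: (x, y, z, opacity byte). Not the real attribute set of either format.
splats = [(float(i), float(i) * 0.5, 1.0, 200 + i % 3) for i in range(6)]
ROW = struct.Struct("<fffB")  # 13 bytes per splat, no padding

def encode_rows(items):
    """Row layout: all attributes of splat 0, then all of splat 1, and so on."""
    return b"".join(ROW.pack(*item) for item in items)

def encode_columns(items):
    """Column layout: every x, then every y, then every z, then every opacity."""
    n = len(items)
    xs, ys, zs, alphas = zip(*items)
    return (
        struct.pack(f"<{n}f", *xs)
        + struct.pack(f"<{n}f", *ys)
        + struct.pack(f"<{n}f", *zs)
        + bytes(alphas)
    )

def complete_in_row_prefix(prefix):
    """Splats fully decodable from the first bytes of a row-ordered file."""
    return len(prefix) // ROW.size

def complete_in_column_prefix(prefix, n):
    """A splat is complete only once its byte in the final column has arrived."""
    last_column_start = 12 * n  # three float columns come first
    return max(0, min(n, len(prefix) - last_column_start))

rows = encode_rows(splats)
columns = encode_columns(splats)
half = len(rows) // 2

row_ready = complete_in_row_prefix(rows[:half])
column_ready = complete_in_column_prefix(columns[:half], len(splats))
```

After half of the bytes have arrived, the row layout already yields three drawable splats while the column layout yields none. The compression advantage of the column layout is not measured here.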

To compress and stream 3DGS data efficiently, Fei-Fei Li's team designed the brand-new .RAD file format. The format is simple to encode and decode, extensible, adjustable in encoding precision, and supports random access.

▲ .RAD file format (Source: World Labs blog)

The file structure is very clear: it starts with the RAD0 file header, followed by the length of the header metadata, the metadata JSON, and one or more data blocks each containing 64,000 splats. The header metadata records the offset addresses and byte sizes of all data blocks, supporting the reading of data block contents in any order.

A single data block also has a similar structure: it starts with the RADC block header, followed by the length of the block metadata, the metadata JSON, and finally the compressed data of the 64,000 splats. The attributes of the splats are stored in columns, and custom encoding methods can be selected for each. Similar data is stored together and then compressed by Gzip, resulting in an excellent compression rate.
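To make the layout concrete, here is a toy container that follows the structure described above: a magic tag, a metadata length, metadata JSON, then Gzip-compressed blocks that are located through an offset table. It is not the actual .RAD specification. The 4-byte little-endian length fields, the JSON key names (`chunks`, `offset`, `bytes`, `count`), offsets measured from the end of the header, and the single float column per block are all assumptions made for this sketch.

```python
import gzip
import json
import struct

def write_block(values):
    """One data block: 'RADC' tag, metadata length, metadata JSON, gzip payload."""
    payload = gzip.compress(struct.pack(f"<{len(values)}f", *values))
    meta = json.dumps({"count": len(values), "compression": "gzip"}).encode()
    return b"RADC" + struct.pack("<I", len(meta)) + meta + payload

def write_file(blocks):
    """File: 'RAD0' tag, metadata length, metadata JSON with a block table, blocks."""
    blobs = [write_block(values) for values in blocks]
    table, offset = [], 0
    for blob in blobs:
        # Offsets are relative to the end of the header (an assumption).
        table.append({"offset": offset, "bytes": len(blob)})
        offset += len(blob)
    meta = json.dumps({"version": 1, "chunks": table}).encode()
    return b"RAD0" + struct.pack("<I", len(meta)) + meta + b"".join(blobs)

def read_block(data, index):
    """Random access: decode one block without touching the others."""
    if data[:4] != b"RAD0":
        raise ValueError("not a toy RAD file")
    (meta_len,) = struct.unpack_from("<I", data, 4)
    entry = json.loads(data[8:8 + meta_len])["chunks"][index]
    start = 8 + meta_len + entry["offset"]
    blob = data[start:start + entry["bytes"]]
    if blob[:4] != b"RADC":
        raise ValueError("bad block tag")
    (block_meta_len,) = struct.unpack_from("<I", blob, 4)
    block_meta = json.loads(blob[8:8 + block_meta_len])
    raw = gzip.decompress(blob[8 + block_meta_len:])
    return list(struct.unpack(f"<{block_meta['count']}f", raw))

data = write_file([[1.0, 2.0], [3.0, 4.0, 5.0]])
second = read_block(data, 1)
```

Because the header lists every block's offset and size, a streaming client could fetch the header first and then request only the byte ranges of the blocks it needs, in whatever order the renderer asks for them.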

The header is encoded as JSON, so the format can evolve through a version field and new optional fields. Data type encodings and compression algorithms are identified by string names in the metadata, which makes it easy to add new ones later.

03. Adopt Virtual Memory to Create a Fixed Video Memory Pool for 16 Million Splats

Virtual memory is a memory management technique. It gives a program a large virtual address space backed by a fixed amount of physical memory, and uses a page table to map virtual addresses to physical addresses in fixed-size pages.

Spark 2.0 applies this concept to 3DGS rendering. Specifically, Fei-Fei Li's team created a fixed video memory pool on the GPU that holds 16 million splats, divided into "video memory pages" of 64,000 splats each. The system automatically manages the mapping between these GPU pages and the corresponding virtual data blocks in the .RAD file.

▲ Virtual memory (Source: World Labs blog)

Data blocks are loaded into free pages in LoD traversal order. When the page table is full and a new data block has higher priority, the system evicts old data using a Least Recently Used (LRU) policy.
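The paging scheme can be sketched as a small LRU page pool. Taking the article's round figures at face value, 16,000,000 splats divided by 64,000 splats per page gives 250 pages. The class and method names below are hypothetical, and the sketch leaves out the priority comparison mentioned above: it always evicts the least recently used block when no page is free.

```python
from collections import OrderedDict

PAGE_SPLATS = 64_000        # splats per page, as quoted in the article
POOL_SPLATS = 16_000_000    # fixed pool capacity, as quoted in the article
NUM_PAGES = POOL_SPLATS // PAGE_SPLATS  # 250 pages with these round figures

class SplatPagePool:
    """Maps data block ids to a fixed set of GPU pages with LRU eviction."""

    def __init__(self, num_pages=NUM_PAGES):
        self.free = list(range(num_pages - 1, -1, -1))  # pop() hands out page 0 first
        self.page_of = OrderedDict()  # block id -> page, least recently used first

    def touch(self, block):
        """Return (page, evicted_block). evicted_block is None if nothing was evicted."""
        if block in self.page_of:
            self.page_of.move_to_end(block)  # mark as most recently used
            return self.page_of[block], None
        evicted = None
        if self.free:
            page = self.free.pop()
        else:
            # Pool is full: reuse the page of the least recently used block.
            evicted, page = self.page_of.popitem(last=False)
        self.page_of[block] = page
        return page, evicted

pool = SplatPagePool(num_pages=2)  # tiny pool so the eviction is easy to follow
history = [pool.touch(block) for block in ["a", "b", "a", "c"]]
```

In this run, blocks `a` and `b` fill the two pages, touching `a` again refreshes it, and loading `c` therefore evicts `b` and reuses its page. The pool never grows, which is what lets a fixed GPU allocation serve a scene of any size.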

Spark 2.0 can load multiple .RAD files at once, all sharing the same page table. For each file, the system records the mapping from data blocks to page table entries, along with the reverse mapping from each page table entry back to its file and data block.

When traversing multiple LoD splat trees, the engine records the order in which data blocks and files are accessed to form a single global priority ranking, and then optimizes splat loading and storage for all 3DGS objects in the scene together.

04. Conclusion: Spark 2.0 Lowers the Barrier to Creating Spatial Intelligence and Vies to Define Its Infrastructure

From its debut in 2025 to today's 2.0 release, Spark's evolution also mirrors, to some extent, the maturing of 3DGS technology.

3D content delivery has long faced two obstacles. First, the assets are too heavy: gigabyte-scale files are daunting to serve over the web. Second, rendering is too expensive: scenes that run smoothly only on high-end GPUs are out of reach for mobile browsers.

Spark 2.0 combines continuous LoD, the .RAD format, and virtual video memory so that high-quality 3D content can travel over the Internet like ordinary images and videos and open instantly.

By open-sourcing this technology, Fei-Fei Li's team lowers the barrier to creating spatial intelligence and stakes a claim to defining the next generation of spatial content infrastructure.

This article is from the WeChat official account “Zhidx” (ID: zhidxcom). Author: Wang Han. Republished by 36Kr with permission.

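The views expressed in this article are solely those of the author. The 36Kr platform only provides information storage services.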
