
Oracle launches the world's largest AI supercomputer, serving as the computing power core of OpenAI's "Stargate"

Xin Zhiyuan (新智元) · 2025-10-21 09:11

It reportedly comprises 800,000 NVIDIA GB300 GPUs, with a claimed peak of 16 ZettaFLOPS.

Last week, Oracle unveiled what it calls the world's largest cloud-based AI supercomputer, OCI Zettascale10. Built from 800,000 NVIDIA GPUs, it has a claimed peak of 16 ZettaFLOPS and serves as the compute core of OpenAI's "Stargate" cluster. Oracle's custom Acceleron RoCE network interconnects the GPUs efficiently, significantly improving performance and energy efficiency. The system signals Oracle's aggressive push into the AI infrastructure race.

Oracle launched the OCI Zettascale10 supercluster at AI World 2025.

At the AI World 2025 conference in Las Vegas, Oracle made a high-profile introduction of what it bills as the world's largest cloud-based AI supercomputer: OCI Zettascale10.

This behemoth spans multiple data centers and comprises up to 800,000 NVIDIA GPUs. Its claimed peak performance is an astonishing 16 ZettaFLOPS (1.6 × 10^22 floating-point operations per second).

Such an astronomical figure means that, on average, each GPU would contribute about 20 PetaFLOPS, approaching the peak throughput of NVIDIA's latest Grace Blackwell (GB300) superchips.
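The per-GPU estimate is a simple division, easy to check from the two numbers Oracle published:

```python
# Back-of-envelope check of the per-GPU figure implied by Oracle's claim.
# Both inputs come from the announcement as reported above.
peak_zettaflops = 16       # claimed aggregate peak
gpu_count = 800_000        # claimed GPU count

total_flops = peak_zettaflops * 1e21                 # 1 ZettaFLOPS = 10^21 FLOPS
per_gpu_petaflops = total_flops / gpu_count / 1e15   # convert to PetaFLOPS

print(per_gpu_petaflops)  # → 20.0
```

Note that this assumes the headline figure is evenly attributable to the GPUs alone, with no accounting for interconnect or utilization losses.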

The launch is a decisive play in the rapidly heating AI compute arms race, as Oracle stakes out its position in cloud AI infrastructure.

The Power Source of OpenAI's Giant Cluster

The Zettascale10 system has become the unsung hero behind OpenAI's enormous compute demands.

According to reports, Oracle and OpenAI jointly built the "Stargate" flagship AI supercomputing cluster in Abilene, Texas, with OCI Zettascale10 as its compute backbone.

Peter Hoeschele, OpenAI's vice president of infrastructure, said that Oracle's custom RoCE high-speed network architecture maximizes overall performance at the gigawatt scale while devoting most of the energy to computation.

In other words, Oracle's RDMA over Converged Ethernet (RoCE) network, called Acceleron, stitches a vast number of GPUs into a single fabric, allowing OpenAI's large-model training to run efficiently across the array.

Thanks to this deep collaboration with OpenAI, Zettascale10 arrives battle-tested: it is already powering some of the most demanding AI workloads in the industry.

Unveiling the Acceleron Network Architecture

The secret to running such a large GPU array efficiently lies in Oracle's homegrown Acceleron RoCE network architecture.

In short, Acceleron lets each GPU's network interface card (NIC) act as a small switch, connecting simultaneously to multiple isolated network switching planes.

This flat, multi-plane design significantly reduces GPU-to-GPU communication latency and ensures that if one route fails, a training job automatically shifts to other paths and keeps running rather than being forced to stop.

Compared with a traditional three-tier switching topology, Acceleron flattens the network hierarchy, making GPU-to-GPU latency more consistent and overall performance more predictable.
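The failover idea described above can be sketched in a few lines. This is a toy model of a NIC attached to several isolated planes, not Oracle's actual Acceleron implementation; the class name `MultiPlaneNIC` and its methods are hypothetical illustrations:

```python
import random

class MultiPlaneNIC:
    """Toy model of a NIC connected to several isolated network planes."""

    def __init__(self, num_planes=4):
        # True = plane healthy, False = plane down
        self.planes = {p: True for p in range(num_planes)}

    def fail_plane(self, plane):
        """Mark one plane as failed, e.g. after a link or switch fault."""
        self.planes[plane] = False

    def send(self, payload):
        """Send over any healthy plane; the job stops only if all planes die."""
        healthy = [p for p, up in self.planes.items() if up]
        if not healthy:
            raise RuntimeError("no healthy plane left: job must stop")
        plane = random.choice(healthy)  # spread traffic across healthy planes
        return f"sent {payload!r} via plane {plane}"

nic = MultiPlaneNIC(num_planes=4)
nic.fail_plane(0)             # one plane goes down mid-job...
print(nic.send("gradients"))  # ...traffic continues on a surviving plane
```

The point of the sketch is that no single plane failure is fatal: the training job only halts if every plane fails at once.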

The architecture also introduces linear pluggable optics (LPO) and linear receive optics (LRO), which cut network power consumption and cooling costs while preserving 400G/800G bandwidth.

Oracle claims this network design improves efficiency and lowers costs, letting customers complete the same AI training tasks with less power.

Ian Buck, NVIDIA's vice president of hyperscale and HPC computing, agreed that this full-stack optimized "compute fabric" provides the foundation needed to move AI from experimentation to industrialization.

Peak Performance on Paper vs. Real-World Tests

Oracle plans to officially provide the Zettascale10 cluster service to customers in the second half of 2026, and the system has started accepting reservations.

However, many industry observers are skeptical of the headline 16-ZettaFLOPS figure.

The number has not been independently verified and is very likely a theoretical peak rather than sustained real-world throughput.

According to industry reports, Oracle's claimed 16 ZettaFLOPS may assume extremely low-precision number formats (such as FP8 or even 4-bit sparse operations).

Real large-model training usually requires higher-precision formats (typically BF16, with FP8 used only selectively) to ensure the model converges. The 16-ZettaFLOPS figure therefore reflects the hardware's upper bound under ideal conditions rather than sustainable performance under everyday workloads.
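To see why precision matters so much to a headline number, consider how accelerator peak throughput typically scales as the number format shrinks. The multipliers below are generic rule-of-thumb assumptions (throughput roughly doubles each time precision halves, and structured sparsity doubles it again), not published GB300 specifications:

```python
# Illustrative only: how a vendor "peak FLOPS" headline can inflate as
# precision drops. Multipliers are rule-of-thumb scaling assumptions,
# NOT published specs for any particular chip.
base_bf16 = 1.0  # normalized per-GPU BF16 dense throughput

scenarios = {
    "BF16 dense":     base_bf16,        # typical training precision
    "FP8 dense":      base_bf16 * 2,    # half the bits, ~2x the rate
    "FP4 dense":      base_bf16 * 4,    # halve again, ~2x again
    "FP4 + sparsity": base_bf16 * 8,    # 2:4 structured sparsity, ~2x more
}

for name, flops in scenarios.items():
    print(f"{name:>14}: {flops:.0f}x BF16")
```

Under these assumptions, a headline quoted at 4-bit sparse precision could be on the order of 8x what a sustained BF16 training job would actually see, which is the core of the skeptics' objection.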

How this "cloud behemoth" performs in practice remains to be seen. Only once the system enters service next year will benchmarks and user feedback reveal whether it is as efficient and reliable as claimed.

Challenges and Prospects in the Cloud AI Race

Oracle is not alone in this race.

Cloud computing giants such as Microsoft, Google, and Amazon are also racing to build large-scale AI clusters of their own, buying GPUs in bulk or developing in-house AI accelerators, and the cloud AI compute landscape is expanding rapidly.

Oracle's heavy bet on Zettascale10 both cements its strategic alliance with OpenAI and signals to the industry that it is a force to be reckoned with in the AI era.

On the commercial side, however, Oracle still faces the question of how to attract customers.

To that end, the company also launched a new multi-cloud universal-credit program, letting customers use unified prepaid credits to consume Oracle Database and OCI services across Oracle Cloud, AWS, Azure, Google Cloud, and other platforms.

The initiative aims to lower migration barriers and increase platform stickiness, building a larger user base for the Oracle Cloud ecosystem.

The arrival of OCI Zettascale10 shows how boldly cloud providers are moving to meet AI's unprecedented demand for compute.

Only when the system goes live next year will we know whether Oracle can seize the moment in the fierce AI infrastructure race with this cloud "behemoth" and deliver on its promises of efficiency, scale, and reliability.

Reference Materials

https://www.oracle.com/news/announcement/ai-world-oracle-unveils-next-generation-oci-zettascale10-cluster-for-ai-2025-10-14/

This article is from the WeChat public account Xin Zhiyuan (新智元), written by Allen, and is published by 36Kr with authorization.