Oracle has unveiled the world's largest AI supercomputer infrastructure, which serves as the computing core of OpenAI's "Stargate".
Last week, Oracle introduced the world's largest cloud AI supercomputer, the "OCI Zettascale10". It comprises up to 800,000 NVIDIA GPUs and offers a claimed peak of 16 ZettaFLOPS, making it the computing core of OpenAI's "Stargate" cluster. Oracle's own Acceleron RoCE network enables efficient networking between the GPUs and significantly improves performance and energy efficiency. The system underscores Oracle's strong position in the race for AI infrastructure.
Oracle presented the OCI Zettascale10 supercluster at the AI World Conference 2025.
At the AI World 2025 conference in Las Vegas, Oracle introduced what it describes as the world's largest cloud AI supercomputer: the OCI Zettascale10.
This giant spans multiple data centers and comprises up to 800,000 NVIDIA GPUs. Oracle claims a peak computing power of 16 ZettaFLOPS (that is, 1.6 × 10^22 floating-point operations per second).
This astronomical figure implies that each individual GPU contributes an average of about 20 PetaFLOPS, which is in the range of NVIDIA's latest GB300 superchip (Blackwell architecture).
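The per-GPU figure follows from simple division; the sketch below only checks that arithmetic against Oracle's claimed (not measured) numbers:

```python
# Back-of-the-envelope check of the per-GPU figure (illustrative only;
# inputs are Oracle's claimed peak and GPU count, not measured values).
peak_zettaflops = 16                    # claimed cluster peak
peak_flops = peak_zettaflops * 10**21   # 1 ZettaFLOPS = 10^21 FLOPS
gpu_count = 800_000                     # claimed maximum GPU count

per_gpu_flops = peak_flops / gpu_count
per_gpu_petaflops = per_gpu_flops / 10**15
print(f"~{per_gpu_petaflops:.0f} PetaFLOPS per GPU")  # → ~20 PetaFLOPS per GPU
```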
With this step, Oracle sets a clear milestone in the rapidly heating race for AI computing power and stakes its claim on the cloud AI infrastructure map.
The driving force for OpenAI's huge cluster
The Zettascale10 system already works behind the scenes to meet OpenAI's immense computing power requirements.
It is known that Oracle and OpenAI have built the "Stargate" AI supercomputer cluster in Abilene, Texas, and the OCI Zettascale10 is the computing power carrier of this cluster.
Peter Hoeschele, OpenAI's vice president of infrastructure, explained that Oracle's customized RoCE high-speed network architecture maximizes overall performance at "gigawatt scale" while directing most of the energy consumption into computation.
In other words, the RDMA over Converged Ethernet (RoCE) network developed by Oracle, code-named Acceleron, links the countless GPUs into a single unit so that training of OpenAI's large models can run efficiently across this huge chip array.
Thanks to the close cooperation with OpenAI, the Zettascale10 system arrives battle-tested at its premiere: it already powers some of the most demanding AI workloads in the industry.
A look inside the Acceleron network architecture
The secret behind the efficient operation of this huge GPU array lies in Oracle's in-house Acceleron RoCE network architecture.
Put simply, Acceleron lets each GPU's network interface card (NIC) act as a small switch that can connect to multiple isolated network planes simultaneously.
This multi-plane, flat network design significantly reduces communication latency between GPUs and ensures that training jobs are automatically rerouted to other paths if one route fails, rather than being interrupted.
Compared with a traditional three-tier switch topology, Acceleron reduces the number of network hops, making GPU-to-GPU latency more consistent and overall performance more predictable.
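The multi-plane failover idea can be sketched as follows. This is a toy model, not Oracle's implementation: the plane count, health flags, and round-robin plane choice are all assumptions made for illustration.

```python
import itertools

class MultiPlaneNic:
    """Toy model of a NIC attached to several isolated network planes.

    Illustrative only: Acceleron's real behavior is not public at this
    level of detail. Here a healthy plane is picked round-robin, and
    traffic is rerouted transparently when a plane fails.
    """

    def __init__(self, num_planes=4):
        self.healthy = [True] * num_planes
        self._rr = itertools.cycle(range(num_planes))

    def mark_failed(self, plane):
        self.healthy[plane] = False

    def send(self, payload):
        # Try each plane at most once, skipping failed ones.
        for _ in range(len(self.healthy)):
            plane = next(self._rr)
            if self.healthy[plane]:
                return f"sent {payload!r} via plane {plane}"
        raise RuntimeError("all planes down")

nic = MultiPlaneNic()
print(nic.send("gradients"))   # goes out via plane 0
nic.mark_failed(1)
print(nic.send("gradients"))   # skips failed plane 1, job keeps running
```

The point of the sketch is the last line: a single plane failure costs one retry at the NIC, not an interrupted training job.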
In addition, the architecture introduces technologies such as linear pluggable optics (LPO) and linear receive optics (LRO), which reduce the network's power and cooling costs without sacrificing 400G/800G bandwidth.
Oracle claims that this innovative network both increases efficiency and reduces costs, so that customers can complete the same AI training tasks with less power.
Ian Buck, a senior NVIDIA executive, agrees that it is precisely this comprehensively optimized "compute fabric" that provides the foundation for bringing AI from the experimental phase to the industrial phase.
Peak-performance claims and the reality check
Oracle plans to officially offer the Zettascale10 cluster services to customers in the second half of 2026. Currently, this system can already be reserved.
Nevertheless, many industry observers have reservations about the astonishing computing power of 16 ZFLOPS.
These figures have not been independently verified and likely represent a theoretical peak rather than sustained real-world performance.
According to industry reports, the 16 ZFLOPS claimed by Oracle could be reached only with very low-precision AI number formats (e.g., FP8 or even 4-bit sparse computation).
Actual training of large models usually requires higher-precision numerical formats (e.g., BF16 or FP16) to ensure model convergence. The 16 ZFLOPS figure therefore reflects the upper bound of Oracle's hardware under ideal conditions rather than performance that can be sustained under normal workloads.
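How much the headline number shrinks at training-grade precision can be sketched with a common rule of thumb. The ratios below (sparse ≈ 2× dense, each doubling of bit width ≈ half the throughput) and the assumption that the headline figure is FP4-sparse are illustrative guesses, not published Zettascale10 specifications:

```python
# Illustrative only: vendor peak-FLOPS claims typically scale with the
# number format. The 2x ratios below are a rule of thumb, NOT published
# Zettascale10 specifications, and the FP4-sparse baseline is assumed.
claimed_peak_zflops = 16.0  # headline figure, assumed FP4 sparse

derived = {
    "FP4 sparse": claimed_peak_zflops,
    "FP4 dense":  claimed_peak_zflops / 2,
    "FP8 dense":  claimed_peak_zflops / 4,
    "BF16 dense": claimed_peak_zflops / 8,  # typical training precision
}

for fmt, zflops in derived.items():
    print(f"{fmt:>10}: ~{zflops:g} ZettaFLOPS")
```

Under these assumptions, the peak relevant to BF16 training would be on the order of 2 ZettaFLOPS, an order of magnitude below the headline, which is why observers want sustained benchmarks rather than a single peak number.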
The real-world performance of this "cloud giant" remains to be seen. Only once the system goes into operation next year will benchmark tests and actual user feedback show whether it is as efficient and reliable as claimed.
Challenges and perspectives in the cloud AI race
Oracle is not alone.
Currently, cloud computing giants such as Microsoft, Google, and Amazon are also striving to build their own large AI clusters. They either buy countless GPUs or develop their own AI acceleration hardware. The map of cloud AI computing power is expanding rapidly.
With the introduction of the Zettascale10, Oracle both consolidates its strategic alliance with OpenAI and signals to the industry its formidable new strength in the AI era.
But in the market, Oracle still has to solve the problem of how to attract customers.
For this purpose, the company has announced a new "multi-cloud credits" program, which lets customers flexibly apply prepaid credits to Oracle databases and OCI services running in Oracle Cloud as well as in AWS, Azure, Google Cloud, and other clouds.
This step is intended to lower the threshold for customer migration and increase platform stickiness to gain a larger user base for the Oracle cloud ecosystem.
The emergence of the OCI Zettascale10 shows the bold attempts of cloud providers to meet the immense computing power requirements of AI.
Whether Oracle can gain an advantage in the fierce competition for AI infrastructure with this cloud giant, and deliver on its promises of efficiency, scale, and reliability, will only become clear once the system enters actual use next year.
Sources
https://www.oracle.com/news/announcement/ai-world-oracle-unveils-next-generation-oci-zettascale10-cluster-for-ai-2025-10-14/
This article is from the WeChat account "New Intelligence World", author: Allen. 36Kr published it with permission.