COMPUTEX Express: Intel Data Center Unveils 288-Core CPU and 480GB VRAM GPU for the First Time, 18A is Here

半导体产业纵横 2026-06-01 11:51
Intel showcases its full range of data center products.

The data center is currently the hottest segment of the entire technology industry.

NVIDIA's data center business has set records for multiple consecutive quarters, and AMD's EPYC series has shown strong growth. The demand for AI inference is triggering a computing power arms race. At this time, Intel's 18A process is maturing, and the transformation of its foundry business is entering a critical stage.

Today, Intel presented its full data center lineup for the first time, spanning CPUs, GPUs, and network cards. This is more than a product showcase; it is also a statement of strategy. Taken together, these factors make this press conference an excellent window into where Intel is heading.

Xeon 6+: A 288-core monster makes its debut, with the first appearance of the 18A process

The most important product in this event is undoubtedly the Xeon 6+.

This is Intel's first application of the Intel 18A process to data center processors. More importantly, it uses Foveros Direct 3D packaging technology, stacking 18A-based compute chips on top of Intel 3 base chips and then interconnecting them with EMIB technology. The entire package consists of 29 components: 12 compute chips, 3 active base chips, 2 I/O chips, and 12 EMIB interconnect tiles.

A single processor can have up to 288 energy-efficient cores, currently the highest core density in the industry. It pairs them with a last-level cache (LLC) of up to 576MB (more than 5 times that of the previous generation) and DDR5 memory running at 8000 MT/s, a significant upgrade to the memory subsystem. Under mainstream workloads, overall performance can increase by up to 2.26 times and performance per watt by up to 1.55 times. Compared with competitors, the per-thread performance of Xeon 6+ is 1.3 times higher, and the per-thread performance per watt is also 1.3 times higher.
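As a quick sanity check on the figures above, the package tally and a couple of derived ratios can be worked out in a few lines. This is only an illustrative sketch: the variable names are made up here, and the per-chip and per-core figures assume an even split, which the article does not state.

```python
# Component counts quoted for the Xeon 6+ package.
package = {
    "compute_chips": 12,
    "active_base_chips": 3,
    "io_chips": 2,
    "emib_tiles": 12,
}
total_components = sum(package.values())

max_cores = 288  # energy-efficient cores per processor (quoted maximum)
llc_mb = 576     # last-level cache in MB (quoted maximum)

# Assumption: cores and cache are spread evenly, which the article does not say.
cores_per_compute_chip = max_cores // package["compute_chips"]
llc_mb_per_core = llc_mb / max_cores

print(total_components, cores_per_compute_chip, llc_mb_per_core)
```

The four component counts do add up to the quoted 29, and an even split would work out to 24 cores per compute chip and 2MB of LLC per core.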

The 18A process brings two key technological upgrades: PowerVia enables shorter and more direct power supply paths, effectively reducing power consumption; RibbonFET reduces standby power consumption while enhancing performance consistency.

A relatively intuitive statistic: compared with the second-generation Xeon, Xeon 6+ can achieve a server consolidation ratio of 9:1, reducing rack space by nearly 80% and energy consumption by 73%. This has a huge impact on operators who are struggling with data center energy consumption and heat dissipation. Ericsson tested Xeon 6+ in the packet core network of a real-world operator deployment: compared with the previous-generation E-core processor, performance increased by 30% with the same number of cores, rack power consumption decreased by 38%, and performance per watt increased by more than 60%.
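The Ericsson figures can be related to one another with simple arithmetic. The sketch below is an illustration only: the helper name is hypothetical, and the article does not say whether the performance gain and the power reduction were measured in the same run, so both readings are shown.

```python
def perf_per_watt_ratio(perf_ratio: float, power_ratio: float) -> float:
    """Return new perf/W divided by old perf/W, given the new/old ratios."""
    return perf_ratio / power_ratio

# Reading 1 (assumption): same performance, 38% less rack power.
iso_performance = perf_per_watt_ratio(1.0, 1 - 0.38)

# Reading 2 (assumption): 30% more performance and 38% less power at once.
combined = perf_per_watt_ratio(1.30, 1 - 0.38)

print(f"iso-performance: {iso_performance:.2f}x, combined: {combined:.2f}x")
```

Under the first reading the ratio is about 1.61x, which lines up with the quoted "more than 60%" improvement. The second reading would give about 2.10x, so the quoted figures most plausibly describe separate operating points.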

Intel also introduced a brand-new hardware feature: Intel AET (Application Energy Efficiency Telemetry Technology). It can monitor CPU power consumption in real time at the workload level, enabling data center operators to optimize energy efficiency and allocate costs more precisely. For cloud service providers and large enterprise data centers, this means more controllable TCO and more precise resource scheduling.

The first data center GPU, with 480GB of video memory for a decisive advantage

If Xeon 6+ is Intel's way of maintaining its core business, then Crescent Island is their first official entry into the data center GPU battlefield. This is the first data center GPU based on the Xe3P architecture, optimized for AI inference and Agent workloads. Its core parameters are extremely impressive: 480GB of LPDDR5 memory and a TDP of 350W.

The number 480GB has special significance. Taking DeepSeek-V4 (1.6T parameters) as an example, only 4 Crescent Island GPUs are needed to deploy it at FP8 precision. The longer context windows and frequent model switching that are common in Agent workflows become more practical thanks to the large memory capacity. Choosing LPDDR instead of HBM keeps power consumption at 350W, which means the card can run in existing air-cooled data centers without liquid-cooling retrofits.
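The "4 GPUs" figure follows from a back-of-the-envelope estimate of weight storage. The sketch below assumes 1 byte per parameter for FP8, decimal gigabytes, and counts model weights only (no KV cache, activations, or runtime overhead); the function name is hypothetical.

```python
import math

def min_gpus_for_weights(params: float, bytes_per_param: float, gpu_mem_gb: float) -> int:
    """Smallest number of GPUs whose combined memory holds the weights alone."""
    weights_gb = params * bytes_per_param / 1e9
    return math.ceil(weights_gb / gpu_mem_gb)

PARAMS = 1.6e12        # 1.6T parameters, as quoted for DeepSeek-V4
FP8_BYTES = 1          # assumption: 1 byte per parameter at FP8
GPU_MEM_GB = 480       # Crescent Island memory capacity

weights_gb = PARAMS * FP8_BYTES / 1e9
gpus = min_gpus_for_weights(PARAMS, FP8_BYTES, GPU_MEM_GB)
headroom_gb = gpus * GPU_MEM_GB - weights_gb

print(gpus, weights_gb, headroom_gb)
```

Under these assumptions the weights take about 1,600GB, so four cards (1,920GB in total) hold them with roughly 320GB left over for context and other runtime state.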

At the same time, Crescent Island supports native FP64. This makes it more than an AI inference card and lays the foundation for a future entry into the HPC market. Internally, Intel is advancing a software stack that combines the CPU and Crescent Island to better support HPC applications. Intel is clearly controlling the product boundaries: it removed capabilities needed only in some general-purpose scenarios and freed up the transistor area for AI performance.

At the software level, Intel is building a unified Xe software stack around four principles: openness, scalable performance, excellent user experience, and support for heterogeneous infrastructure. Intel has chosen an upstream-first strategy: mainstream frameworks such as PyTorch, vLLM, and SGLang will be supported from the very beginning. It has also partnered with SambaNova. The latter focuses on high-throughput, low-latency centralized inference in large-scale systems, while Crescent Island targets small-scale deployments at the edge and in the enterprise (such as 8-card or 16-card all-in-one machines).

Currently, more than 20 OEM and ODM manufacturers are developing products based on Crescent Island. This number signals the accelerating expansion of Intel's ecosystem.

In addition to CPUs and GPUs, a new E835 Ethernet network card was also released. It delivers throughput of up to 200GbE and supports RDMA and Dynamic Device Personalization (DDP). When running at full-load 200G bidirectional line rate, its power consumption is 28% to 47% lower than that of similar products, and its energy efficiency is 1.4 to 1.9 times that of competitors. It has built-in hardware-level security capabilities such as a silicon root of trust and firmware attestation, plus a product lifecycle of more than 10 years, which makes it a more stable technology investment for long-term data center operations.

In the era of Agentic AI, the CPU returns to the center stage

In the past two or three years, AI inference was almost equivalent to the work of GPUs. However, with the rise of Agentic AI, the rules of the game are being rewritten. Kevork Kechichian, Executive Vice President of Intel Corporation and General Manager of the Data Center Group (DCG), said: "The CPU is now at the center of all these processes, trying to orchestrate and schedule the entire situation."

The Agent workflow involves multiple steps, multiple inferences, and multiple calculations, and requires maintaining a very long context window. Multiple expert Agents spawn multiple sub-Agents to collaborate on complex tasks, resulting in an exponential increase in token consumption. In this scenario, the GPU is responsible for thinking (inference, code generation), while the CPU is responsible for execution (orchestration, scheduling, simulation, context management). The CPU-to-GPU ratio is evolving from the traditional 1:8 to 1:4, 1:2, or even 1:1, and in reinforcement learning scenarios it may even be reversed.

This explains why the high core density of Xeon 6+ is so important. Intel's own tests show that 400 to 500 or more Agents can easily be deployed to run concurrently on the 288-core Clearwater Forest. More importantly, the CPU's built-in accelerators (matrix engine, vector engine) and confidential computing capabilities (TDX, SGX) meet the strict data privacy and security isolation requirements of the Agent scenario. When multiple Agents run in parallel and multiple tenants are scheduled in parallel, TDX and SGX can ensure that private information stays within a secure, controlled boundary on a trusted platform.

x86 will still dominate in 2030

The influence of the x86 architecture in the data center has not been weakened by the AI wave; instead, it has been reinforced in some key scenarios.

Intel divides workloads into three major categories: scale-out scenarios that require high-density computing, general scenarios that balance performance and data throughput, and AI training scenarios that are computationally intensive. However, a new intermediate area is emerging outside these traditional classifications: the hybrid scenario on the inference side, which needs GPU-level acceleration but remains centered on the CPU.

The rise of this hybrid scenario is more significant than it seems. AI inference and training differ considerably. Training requires large-scale parallel computing, and GPUs are the absolute main force. The inference stage, however, especially in enterprise-level Agent workflows, involves multi-step inference, context management, scheduling, and simulation, and these are exactly the strengths of the CPU. When token consumption increases exponentially, when multiple Agents run in parallel, and when a very long context window needs to be continuously maintained, the CPU is no longer a bystander but the orchestration center of the entire system.

Intel presented a figure at the press conference: It is estimated that by 2030, 80% of the more than 80 million networked servers worldwide will still be based on the x86 architecture. Currently, inference and Agent AI almost completely run on x86.

The broad x86 software ecosystem and developer community, the hardware acceleration capabilities polished over the years (such as IAA memory compression and CXL memory expansion), and the mature manageability and security features may have been just "basic skills" in the past, but they have suddenly become treasures in the era of Agent AI. High memory costs and surging capacity requirements have brought IAA technology back into customers' focus, while CXL memory pooling makes it possible to share the cache hierarchy across CPUs.

Intel is also addressing this differentiation through fine-tuning at the architectural level. For different workloads, it is pursuing two routes in parallel: P-core (performance core) and E-core (energy-efficient core). The P-core has clear performance advantages in general computing and has received positive feedback from customers, while the E-core is becoming increasingly indispensable in high-density, low-power Agent scenarios. Developing the two core types in parallel, rather than choosing one over the other, gives x86 more flexible market positioning in the AI era. ARM has been in the server field for many years, but the ecosystem barriers and maturity of x86 will remain difficult to shake for the foreseeable future.

From chips to rack level: Intel's ambitions

Intel's plans for the data center do not stop there. According to the roadmap, Intel will next launch Diamond Rapids, which is expected to be released in 2027 and will use the 18A-P process, a more advanced node than the 18A used in Xeon 6+. It uses a scalable SoC architecture and introduces a uniform memory latency design. For memory and I/O, Diamond Rapids has twice as many channels as the previous generation, memory speed is improved across the board, and PCIe Gen6 is supported, providing stronger support for bandwidth-limited and I/O-intensive applications.

In terms of application scenarios, Diamond Rapids targets high-demand IaaS environments, high-performance computing, bandwidth-intensive applications, and I/O-intensive workloads, which are exactly the directions of infrastructure upgrades driven by AI inference and Agent workflows.

From Xeon 6+ to Diamond Rapids, the process node, product density, memory bandwidth, and I/O performance all improve systematically with each generation. It is worth noting that the rapid maturing of the Intel 18A process underpins Intel's data center products. Starting from Xeon 6+, all core products are based on the 18A process, which means not only higher performance and energy efficiency but also that Intel's product planning and process nodes are finally in sync.

In 2026, the data center market is experiencing a profound