The CPU Is Amazing!
As AI shifts from training to inference, CPUs are becoming increasingly important, even indispensable, in the AI era. This is evident from how chip giants and research institutions have been forecasting, and revising, the future Total Addressable Market (TAM):
In May 2026, AMD CEO Dr. Lisa Su raised the long-term forecast for the global server CPU market on the Q1 2026 earnings call. AMD now expects the server CPU TAM to grow at a compound annual growth rate (CAGR) of over 35% and exceed $120 billion by 2030. Just a few months earlier, its forecast for the same figure was only $60 billion.
Citi, the Wall Street investment bank, was even more optimistic in its mid-May 2026 client report, pegging the global CPU market at $131.5 billion in 2030. Arm likewise put its relevant TAM in the tens of billions of dollars.
NVIDIA was more aggressive still. On its Q1 fiscal 2027 earnings call, it said that the Vera CPU would open up a "new $200 billion TAM" for NVIDIA and that the company already had visibility into nearly $20 billion of CPU revenue this year. A subsequent Reuters report stated that Jensen Huang confirmed NVIDIA's $200 billion forecast for the CPU market included the Chinese market, and that the company was accelerating production of the Vera Rubin platform.
In just a few months, expectations for the data-center CPU market have been pushed up from $60 billion to $120 billion and then to the $200 billion level. The CPU market is indeed expanding rapidly. So the question is: why do data centers in the AI era suddenly need so many CPUs?
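As a rough sanity check on these figures, the sketch below works through the compound-growth arithmetic. The five-year compounding window (2025 to 2030) is an assumption made for illustration; the source gives only the growth rate and the 2030 endpoint, both as lower bounds, so the implied base is indicative and not a reported number.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate that takes start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1


def implied_start(end_value: float, rate: float, years: int) -> float:
    """Starting value implied by an end value and a constant annual growth rate."""
    return end_value / (1 + rate) ** years


END_TAM_BN = 120.0      # from the article: "over $120 billion" by 2030
OLD_TAM_BN = 60.0       # from the article: the earlier forecast
CAGR_RATE = 0.35        # from the article: "over 35%"
YEARS = 5               # ASSUMPTION: compounding from 2025 to 2030

base_bn = implied_start(END_TAM_BN, CAGR_RATE, YEARS)
old_rate = cagr(base_bn, OLD_TAM_BN, YEARS)

print(f"Implied base-year TAM: ${base_bn:.1f}B")
print(f"CAGR implied by the earlier $60B forecast from the same base: {old_rate:.1%}")
```

Under that assumed window, the revised forecast implies a base of roughly $27 billion, and the earlier $60 billion target would have corresponded to a growth rate of about half the revised one.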
Agentic AI, Repricing the CPU
As AI infrastructure shifts from training centers to inference centers and on to agent centers, the role of CPUs is growing rapidly.
Early on, traditional large-model inference was essentially a one-request, one-response process: the user enters a question and the model generates an answer. In that model, GPUs handled most of the matrix computation, while CPUs were mainly responsible for scheduling and data preparation. Agentic AI is different. It has to continuously plan tasks, call tools, access databases, execute code, retrieve knowledge, coordinate multiple sub-agents, and maintain state across a long context.
AMD CEO Lisa Su pointed out on the earnings call that as inference and agentic AI workloads grow rapidly, compute demand for server CPUs is being amplified again. The reason is that such applications do not rely only on GPUs and accelerators for model computation; they also need CPUs to take on a large amount of system-level work, including task orchestration, data scheduling, parallel execution, resource coordination, and serving as the head node of GPU and accelerator clusters to schedule the entire AI system.
Citing AMD's earnings call, TrendForce stated that with the development of agentic AI, the industry is moving away from the long-standing configuration of "one CPU for four to eight GPUs" toward a higher-density configuration in next-generation data centers, even approaching a 1:1 CPU-to-GPU ratio.
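To make the ratio shift concrete, here is a minimal sketch of how CPU demand scales for a fixed GPU fleet as the CPU-to-GPU ratio moves from 1:8 toward 1:1. The fleet size of 10,000 GPUs and the function name are hypothetical, chosen only for illustration.

```python
import math


def cpus_needed(gpus: int, gpus_per_cpu: float) -> int:
    """CPUs required to serve `gpus` accelerators at a given GPUs-per-CPU ratio."""
    return math.ceil(gpus / gpus_per_cpu)


FLEET_GPUS = 10_000  # HYPOTHETICAL fleet size, not from the article

# Ratios named in the article: one CPU per eight or four GPUs, then 1:1.
for label, gpus_per_cpu in [("1:8", 8), ("1:4", 4), ("1:1", 1)]:
    print(f"CPU:GPU = {label} -> {cpus_needed(FLEET_GPUS, gpus_per_cpu):,} CPUs")
```

For the same number of GPUs, going from 1:8 to 1:1 multiplies the CPU count by eight, which is the mechanism behind the upward TAM revisions.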
Intel CEO Lip-Bu Tan also pointed out at the 54th J.P. Morgan Global Technology Conference that some customers have told Intel the CPU-to-GPU configuration ratio is moving from 1:8 to 1:1, and can even reach 4:1. Tan added that the market for physical AI (embodied intelligence) is quite large. He therefore sees an opportunity for Intel to focus on CPUs and to introduce new architectures that are purpose-built and optimized for specific workloads. The same applies to accelerators, which Intel needs to make sure it has. Recently, Intel hired Alex from Qualcomm to build out physical AI, covering the entire stack from silicon optimization to software and system platform engineering, and to advance solutions for robots or digital workers.
NVIDIA emphasized that today's data centers have evolved into AI factories that generate continuous revenue. What customers really buy is not a single GPU but a complete set of AI infrastructure capabilities. The key indicators for measuring the value of AI infrastructure are therefore no longer the unit price of GPUs but token output per unit of power, token output per unit of cost, system uptime, resource utilization, speed of deployment and production ramp, long-term availability of the software stack, and the full life-cycle value of equipment assets.
The fundamental logic behind the upward revision of the CPU market: the CPU is no longer just a general-purpose computing unit in a server but the system-scheduling center of an AI factory. As AI moves from training to inference, from inference to Agentic AI, and from single servers to multi-GPU clusters and AI factories, what data centers are really competing on is no longer a single GPU but a system. In this system, the CPU keeps order: it schedules tasks, manages memory, moves data, coordinates accelerators, handles control flow, supports database and tool calls, and organizes GPUs, DPUs, NICs, storage, and software stacks into a runnable, scalable, and billable AI factory.
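The two efficiency metrics named above can be sketched as simple ratios. Everything in the sample below is a made-up placeholder (the class name, the field names, and all numbers); the article does not define how NVIDIA measures cost or power, so treat this as one plausible reading.

```python
from dataclasses import dataclass


@dataclass
class FactorySample:
    """One measurement window of an AI factory (all fields HYPOTHETICAL)."""
    tokens: float        # tokens generated in the window
    avg_power_kw: float  # average facility power draw, in kW
    hours: float         # length of the window, in hours
    cost_usd: float      # ASSUMPTION: all-in cost attributed to the window

    def energy_kwh(self) -> float:
        return self.avg_power_kw * self.hours

    def tokens_per_kwh(self) -> float:
        """Token output per unit of energy consumed."""
        return self.tokens / self.energy_kwh()

    def tokens_per_dollar(self) -> float:
        """Token output per unit of cost."""
        return self.tokens / self.cost_usd


# Placeholder numbers for illustration only; not measured data.
sample = FactorySample(tokens=3.6e9, avg_power_kw=120.0, hours=1.0, cost_usd=60.0)
print(f"{sample.tokens_per_kwh():.3g} tokens/kWh")
print(f"{sample.tokens_per_dollar():.3g} tokens/USD")
```

Framed this way, any component that raises token throughput without a proportional rise in power or cost, the CPU included, improves both ratios.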
The x86 Rivals, in a Fierce Battle
The data shows that in server CPU procurement for large cloud providers' data centers, the higher-value and more profitable part of the market, AMD has almost caught up with Intel.
In Q1 2026, AMD's data-center revenue reached $5.8 billion, up 57% year on year, driven mainly by demand for EPYC CPUs and Instinct GPUs. The company also said that server CPU revenue hit a record for the fourth consecutive quarter, growing more than 50% year on year. Major cloud providers are expanding their use of EPYC to support a range of AI workloads, from general-purpose computing and data processing to accelerator head nodes and emerging agentic applications.
According to Mercury Research data, AMD's shipment share in the server CPU market has climbed to 33.2%, and its revenue share has reached a record 46.2%. This means that out of all the server CPUs shipped globally in Q1 2026, approximately one out of every three is an AMD product. The average selling price (ASP) of AMD's processors is significantly higher than that of Intel's.
This reveals a hidden concern for Intel: although it still controls nearly two-thirds (about 66.8%) of server CPU shipments, that volume includes lower-priced mainstream and mid-to-low-end chips, so its revenue share has been eroded to only about 53.8%.
As the traditional incumbent in the server CPU market, Intel has long relied on the Xeon ecosystem to cover enterprise, cloud, database, HPC, networking, and edge computing. Xeon 6 comes in two lines, P-core and E-core. The E-core line focuses on high core density and performance per watt, while the P-core line targets a wider range of workloads, including AI and HPC, and has AI acceleration built into each core.
Facing the continued loss of share in the high-value data-center market, Intel is positioning Clearwater Forest (Xeon 6+) as the key product for its next counterattack. As Intel's first server processor built on the 18A process, Clearwater Forest is expected to be released in the first half of 2026, mainly targeting hyperscale data centers, cloud service providers, and telecom operators.
The core logic of this product is to use a more advanced process and a higher-density core design to meet cloud customers' requirements for performance, energy efficiency, and deployment density. Clearwater Forest will provide up to 288 energy-efficient cores. Compared with the previous-generation Sierra Forest, the instructions per cycle (IPC) of its energy-efficient cores have increased by about 17%. For Intel, this is not only a new-generation Xeon processor but also a key attempt to rebuild the competitiveness of its server CPUs with the 18A process.
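The gap between shipment share and revenue share can be turned into a rough relative price comparison. The sketch below assumes the Mercury Research figures describe a two-vendor market in which AMD and Intel shares sum to 100%, which the quoted numbers are consistent with; the function name is illustrative.

```python
def relative_asp(revenue_share: float, unit_share: float) -> float:
    """Vendor ASP relative to the market-average ASP (1.0 = market average)."""
    return revenue_share / unit_share


# Shares quoted in the article (percent).
AMD_UNIT, AMD_REV = 33.2, 46.2
INTEL_UNIT, INTEL_REV = 66.8, 53.8

amd_rel = relative_asp(AMD_REV, AMD_UNIT)
intel_rel = relative_asp(INTEL_REV, INTEL_UNIT)
asp_ratio = amd_rel / intel_rel  # AMD ASP as a multiple of Intel ASP

print(f"AMD ASP vs. market average:   {amd_rel:.2f}x")
print(f"Intel ASP vs. market average: {intel_rel:.2f}x")
print(f"AMD ASP vs. Intel ASP:        {asp_ratio:.2f}x")
```

Under that two-vendor assumption, the shares imply an AMD average selling price of roughly 1.7 times Intel's, which is what "significantly higher" amounts to in numbers.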
Although the CPU market is expanding, the competition is also intensifying.
AMD has integrated CPUs with GPUs and rack-level systems. AMD's sixth-generation EPYC, Venice, uses the Zen 6 architecture and a 2nm process and targets cloud, enterprise, and AI workloads. Within the family, Verano is AMD's first EPYC CPU designed specifically for AI infrastructure. AMD also said that Venice will cover several directions, including throughput optimization, performance-per-watt optimization, price-performance optimization, and AI infrastructure optimization, and is planned for release later in 2026.
AMD is also accelerating its investment in the supply chain. Lisa Su said in Taipei that global CPU demand is higher than anyone expected a year ago and that the CPU market has tightened. AMD is expanding production capacity with its Taiwanese partners and plans to invest over $10 billion in Taiwan's AI industry, focusing on advanced packaging, substrates, and rack-level system manufacturing. Its partners include ASE, SPIL, PTI, Wiwynn, Wistron, Inventec, Unimicron, AIC, Nan Ya PCB, and Kinsus.
Intel CEO Lip-Bu Tan revealed at the 54th J.P. Morgan Global Technology Conference that Intel 18A has taken Panther Lake into mass production, and that yield is improving by about 7% per month, ahead of internal expectations.
On the CPU side, he emphasized that Intel must push deeper architectural changes and gradually shift to more customized chip designs. On the accelerator side, Intel will fill out its capabilities through cooperation with SambaNova.
Lip-Bu Tan said, "The combination of advanced packaging, wafer-foundry capabilities, and the new-generation CPU architecture we are introducing is very exciting." His view of Intel's future competitive strategy is no longer limited to a single chip. He pointed out that the company will need to develop in a full-stack manner and not stay at the silicon or network level. Companies need to build software capabilities, optimize memory, deliver genuine platform-level solutions, and ultimately ship a complete chassis-level system architecture to customers. No single company can do this alone. Tan particularly emphasized the importance of cooperation: Intel needs partners to complement its capabilities and ecosystem partners to support customers in deployment.
NVIDIA's Vera CPU, Making a Bold Entrance
NVIDIA, on the one hand, firmly holds the leading position in the data-center market with its GPUs. On the other hand, with the launch of the Vera CPU, the GPU giant is moving into CPU territory.
NVIDIA already had the Grace CPU and entered the CPU-GPU superchip stage with Grace Blackwell. According to NVIDIA's official information, the GB200 NVL72 connects 36 Grace CPUs and 72 Blackwell GPUs in a rack-level, liquid-cooled system, forming a 72-GPU NVLink domain to serve scenarios such as real-time inference on trillion-parameter models.
Vera is an even more powerful next step.
On the Q1 fiscal 2027 earnings call, NVIDIA said that Vera is based on a custom Arm core and is co-designed end to end with the Rubin GPU and NVLink. Compared with x86 alternatives, NVIDIA claims Vera achieves up to 1.5 times the single-core performance, 2 times the performance per watt, and 4 times the rack density. NVIDIA also said that Vera Rubin will start production and shipment in the second half of 2026, beginning in Q3, and that by integrating seven types of dedicated chips across five acceleration racks, it can achieve up to 35 times the inference throughput and up to 10 times the AI Factory revenue compared with Blackwell.
NVIDIA doesn't just want to make GPUs. It wants to integrate CPUs, GPUs, NVLink, DPUs, NICs, switches, and software stacks into a complete platform.
This will put pressure on manufacturers such as Intel, AMD, Arm, Broadcom, and Marvell. For Intel and AMD, NVIDIA is invading CPU territory. For Broadcom and Marvell, NVIDIA is also squeezing the system-level influence of external interconnect chip suppliers through its Spectrum-X, ConnectX, BlueField, and NVLink product lines. For Arm, Vera is both good news and a mixed signal: it strengthens the presence of the Arm architecture in the data center, but it may also tie the Arm ecosystem more closely to the NVIDIA platform.
Arm Enters the Chip-Making Arena, Reshaping the Server CPU Ecosystem
Arm is one of the most notable variables in this round of CPU market expansion.
Arm's AGI CPU is its first truly mass-produced chip product, designed specifically for agentic AI workloads. The processor is based on the Arm Neoverse CSS V3 platform and targets high-performance computing and high-density rack deployment, following a clearly rack-first design approach. According to Arm's official information, a 36kW air-cooled ORv3 rack can be configured with 30 1U servers, each equipped with two Arm AGI CPUs, for a total of 8,160 high-performance CPU cores.
In terms of specifications, the Arm AGI CPU provides up to 136 Neoverse V3 cores, each with a 2MB L2 cache. It uses the Armv9.2 architecture, supports bfloat16 and INT8 AI instructions, and has a thermal design power of 300W. It also supports 12-channel DDR5-8800 memory, giving each core a maximum memory bandwidth of about 6GB/s, and provides 96 PCIe Gen6 lanes with support for CXL 3.0 memory expansion and interconnect.
Arm mentioned in its official blog that Meta is the lead partner and customer for the AGI CPU. The two companies jointly developed the chip to optimize Meta's deployment of gigawatt-scale AI infrastructure and to work alongside Meta's in-house MTIA accelerator.
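The rack and memory figures quoted for the AGI CPU can be cross-checked with a few lines of arithmetic. The sketch below assumes each DDR5 channel has a 64-bit (8-byte) data path and uses decimal gigabytes; both are assumptions for illustration, not statements from Arm, and the result is theoretical peak bandwidth, not measured throughput.

```python
# Figures quoted in the article.
SERVERS_PER_RACK = 30
CPUS_PER_SERVER = 2
CORES_PER_CPU = 136
RACK_POWER_KW = 36
CPU_TDP_W = 300
MEM_CHANNELS = 12
TRANSFER_RATE_MT_S = 8800   # DDR5-8800: mega-transfers per second

# ASSUMPTION: 64-bit data path per channel, i.e. 8 bytes per transfer.
BYTES_PER_TRANSFER = 8

cpus_per_rack = SERVERS_PER_RACK * CPUS_PER_SERVER
cores_per_rack = cpus_per_rack * CORES_PER_CPU
cpu_power_kw = cpus_per_rack * CPU_TDP_W / 1000

# Theoretical peak bandwidth per socket, in decimal GB/s.
peak_bw_gb_s = MEM_CHANNELS * TRANSFER_RATE_MT_S * BYTES_PER_TRANSFER / 1000
bw_per_core_gb_s = peak_bw_gb_s / CORES_PER_CPU

print(f"CPUs per rack:        {cpus_per_rack}")
print(f"Cores per rack:       {cores_per_rack}")
print(f"CPU TDP per rack:     {cpu_power_kw:.1f} kW of {RACK_POWER_KW} kW")
print(f"Peak memory BW/CPU:   {peak_bw_gb_s:.1f} GB/s")
print(f"Peak memory BW/core:  {bw_per_core_gb_s:.2f} GB/s")
```

Under these assumptions the numbers are mutually consistent: 60 sockets give the quoted 8,160 cores, CPU TDP accounts for half of the 36kW rack budget, and the per-core bandwidth lands slightly above the rounded 6GB/s figure.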
Arm clearly mentioned in its Q4 fiscal 2026 earnings conference call that by FYE31, the company expects the AGI CPU revenue to reach $15 billion and the IP revenue to reach $10 billion, with a total revenue target of $25 billion.
For Arm, the AGI CPU is not just a product launch but an expansion of the company's business model. In the past, Arm sat mainly at the foundation of the server CPU industry, participating in the growth of data-center chips from customers such as AWS, Google, Microsoft, NVIDIA, and Ampere through architecture licensing and IP royalties. The AGI CPU means that Arm is, for the first time, entering the data-center CPU market directly as a supplier of finished silicon.
As a result, Arm's role has become more delicate. On the one hand, it can still benefit from cloud providers' in-house CPUs through the Neoverse architecture, the CSS compute subsystems, and its IP licensing model. On the other hand, now that the AGI CPU is entering the market as a mass-produced chip, Arm may find itself competing with some of the customers that license its IP.
Either way, Arm is no longer just an IP company known for its mobile architecture. It is becoming an increasingly important underlying-architecture provider and infrastructure-level player in the data-center CPU market.
Conclusion
In the first half of the AI era, the industry remembered the rapid rise of GPUs. In the second half, the rapidly expanding CPU is proving that reshaping the order of system-level computing power still depends on this long-standing king of system control.
This new blue ocean in the CPU market is destined not to be calm. AMD's close pursuit in revenue share, Intel's do-or-die counterattack built on the advanced 18A process, NVIDIA's all-in-one system approach with the Vera platform, and Arm's entry into the market with its finished AGI CPU silicon all indicate that a high-stakes war over computing power has only just begun.
This article is from the WeChat official account "Semiconductor Industry Observation" (ID: icbank), author: Du Qin DQ, published by 36Kr with authorization.