Data centers are becoming the core engine driving global economic and social development at unprecedented speed and scale. If the PC and smartphone eras defined the semiconductor industry's golden decades, then data centers, powered by artificial intelligence (AI), cloud computing, and hyperscale infrastructure, are opening a brand-new "chip" era.
This is not a gradual evolution but a disruptive transformation. Data center chip demand is rapidly evolving from simple processors and memory into a comprehensive, complex ecosystem spanning computing, storage, interconnect, and power delivery. This powerful wave of demand is pushing the data center semiconductor market toward a trillion-dollar scale at unprecedented speed.
The AI Frenzy: The "Arms Race" in Data Centers
The explosion of artificial intelligence, especially generative AI, is the most powerful catalyst for this transformation. According to industry forecasts, AI-related capital expenditure has overtaken non-AI expenditure and now accounts for nearly 75% of data center investment; by 2025 the figure is expected to exceed $450 billion. AI servers are growing rapidly, rising from a few percent of all computing servers in 2020 to over 10% in 2024.
Driven by foundation model training, inference, and custom chip innovation, global technology giants are locked in a fierce "computing power arms race." Leading players such as Microsoft, Google, and Meta invest tens of billions of dollars annually, and small and medium-sized players are quickly following suit, knowing that future competitive advantage will depend directly on the scale of their infrastructure and on chip-level differentiation.
This explosive growth is creating unprecedented semiconductor demand. Yole Group analysis indicates that the data center semiconductor market is expected to accelerate from 2024 onward and reach $493 billion by 2030. By then, data center semiconductors are expected to account for over 50% of the entire semiconductor market, and the segment's compound annual growth rate (2025-2030) is almost twice that of the semiconductor industry as a whole, reflecting the huge transformation driven by demand for AI, cloud computing, and hyperscale infrastructure.
The Carnival of Chips
The Race between GPUs and ASICs: GPUs will undoubtedly continue to dominate, and they are the fastest-growing segment, driven by the increasing complexity and processing demands of AI-intensive workloads.
NVIDIA, with its powerful GPU ecosystem, is transforming from a traditional chip design company into a full-stack AI and data center solution provider. Its core weapon, the Blackwell GPU, continues to dominate the field, built on TSMC's advanced 4nm-class process.
To counter NVIDIA's market dominance, large cloud providers such as AWS, Google, and Microsoft Azure are developing their own silicon, such as AWS's Trainium and Inferentia AI accelerators and Graviton server CPUs, and innovating in customized storage and network hardware. These developments are making competition in the AI chip field even more intense; particularly in the training and inference stages, performance differentiation among AI chips will become the core of enterprise competition.
This parallel development of GPUs and ASICs is becoming the main theme of data center computing.
HBM: With the exponential growth of AI model sizes, the bandwidth of traditional memory has become the biggest bottleneck to improving compute. High-bandwidth memory (HBM) has emerged in response: its innovative 3D-stacking architecture greatly increases memory bandwidth and capacity, making it the "standard configuration" for AI and high-performance computing (HPC) servers.
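To make the bandwidth argument concrete, here is a minimal back-of-the-envelope sketch in Python; the 1024-bit interface, 6.4 Gbit/s pin rate, and six-stack configuration are illustrative HBM3-class assumptions, not the specification of any particular accelerator.

```python
# Back-of-the-envelope HBM bandwidth estimate (illustrative numbers, not vendor specs).
# Per-stack bandwidth = interface width (bits) x pin data rate (Gbit/s) / 8.

def hbm_bandwidth_gbps(interface_bits: int, pin_rate_gbps: float, num_stacks: int) -> float:
    """Aggregate bandwidth in GB/s for a device with several HBM stacks."""
    per_stack = interface_bits * pin_rate_gbps / 8  # GB/s per stack
    return per_stack * num_stacks

# Assumed example: 1024-bit interface, 6.4 Gbit/s per pin, six stacks on one accelerator.
print(hbm_bandwidth_gbps(1024, 6.4, 1))   # ~819 GB/s for a single stack
print(hbm_bandwidth_gbps(1024, 6.4, 6))   # ~4.9 TB/s aggregate across six stacks
```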
According to a research report by Archive Market Research, the HBM market is experiencing explosive growth: it is expected to reach $3.816 billion in 2025, with a compound annual growth rate (CAGR) of up to 68.2% from 2025 to 2033. HBM's rapid rise is becoming another strong growth engine in the memory semiconductor market.
The HBM market for AI chipsets shows several key trends:
Bandwidth and capacity: modules with more than 8 GB per stack are becoming increasingly common, driven by the urgent need of ever more complex AI models for faster data processing.
Power efficiency: energy consumption is a key constraint, driving innovation in low-power HBM design.
Tighter integration: direct integration of HBM into AI accelerators is increasingly common, minimizing latency and improving overall performance.
Standardized interfaces: the industry is converging on standard interfaces, which simplifies system integration and shortens the time-to-market of AI systems.
Advanced packaging: technologies such as through-silicon vias (TSVs) make high-density, efficient HBM stacks possible.
Edge AI: growing demand for edge AI is driving cost-effective HBM solutions for embedded systems and mobile devices.
The main players in the global HBM market are SK Hynix, Samsung, and Micron Technology, along with several Chinese manufacturers. SK Hynix and Samsung together account for over 90% of global HBM supply. Micron Technology has become the first US company to mass-produce HBM3E, and its products are used in NVIDIA's H200 GPU.
DPU and Network ASIC: In the era of massive data flows, efficient network interconnect is crucial. Data processing units (DPUs) and high-performance network ASICs offload network processing from CPUs and GPUs and optimize traffic management, freeing up more computing resources. DPUs also offer significant advantages in security, scalability, energy efficiency, and long-term cost-effectiveness.
Disruptive Technologies: Opening a New Chapter in the Post-Moore's Law Era
If AI is the core driver of data center chip demand, then a series of disruptive technologies is redefining data center performance, efficiency, and sustainability from the architecture up.
Silicon Photonics and CPO: Data transmission within data centers is rapidly shifting from traditional copper connections to optical interconnects. Silicon photonics, especially co-packaged optics (CPO), is becoming key to solving the challenge of high-speed, low-power interconnect. CPO integrates the optical engine directly into the package of compute chips such as CPUs, GPUs, and ASICs, greatly shortening the electrical signal path, reducing latency, and significantly improving energy efficiency.
Industry giants such as Marvell, NVIDIA, and Broadcom are actively investing in this field, which is expected to generate billions of dollars in revenue by 2030. In January this year, Marvell announced a breakthrough co-packaged optics architecture for custom AI accelerators: an XPU integrated with CPO improves AI server performance by scaling XPU density from dozens within a rack to hundreds across multiple racks.
The thin-film lithium niobate (TFLN) modulator is another breakthrough in optical communications, combining high-speed lithium niobate modulator technology with the scalability of silicon photonics. TFLN modulators offer ultra-high bandwidth (>70 GHz), very low insertion loss (<2 dB), a compact footprint, low drive voltage (<2 V), and compatibility with CMOS production processes. These characteristics make them an ideal choice for 200 Gbps-per-lane modulators and high-performance connections within data centers.
CPO directly addresses the "electrical interface bottleneck" that limits pluggable solutions, enabling a key leap in power and bandwidth. It can be regarded as a breakthrough of the "electrical wall," following the "memory wall," and is crucial for coping with rising power density and moving data more efficiently within and between data centers. CPO also enables longer-reach, higher-density XPU-to-XPU connections, promoting further disaggregation of data center architectures and allowing computing resources to be distributed more flexibly.
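As a rough illustration of why energy per bit matters at these bandwidths, the sketch below multiplies an assumed energy-per-bit figure by an assumed aggregate bandwidth; the 15 pJ/bit (pluggable path) and 5 pJ/bit (co-packaged path) values are illustrative assumptions, not measured figures for any product.

```python
# Rough interconnect power comparison (illustrative energy-per-bit assumptions).
# Link power (W) = energy per bit (J/bit) x aggregate bandwidth (bit/s).

def link_power_watts(energy_pj_per_bit: float, bandwidth_tbps: float) -> float:
    """Power drawn by an optical interconnect path at a given aggregate bandwidth."""
    return energy_pj_per_bit * 1e-12 * bandwidth_tbps * 1e12

# Assumed figures: ~15 pJ/bit for a pluggable-module path, ~5 pJ/bit for a
# co-packaged path, each moving 100 Tbit/s of aggregate traffic.
print(link_power_watts(15, 100))  # ~1500 W for the pluggable path
print(link_power_watts(5, 100))   # ~500 W for the co-packaged path
```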
Advanced Packaging: CPO is just one example of advanced packaging in the data center. Through technologies such as 3D stacking and chiplets, semiconductor manufacturers can integrate dies with different functions (compute, memory, I/O, and so on) on the same substrate to build more powerful and flexible heterogeneous computing platforms. This "Lego-style" approach to chip design not only pushes past the physical limits of classical Moore's Law scaling but also provides greater flexibility for customized chips.
Next-Generation Data Center Design: Maximizing Efficiency
DC Power Supply: As AI workloads drive surging compute demand, data center power density has soared, posing huge challenges to the traditional alternating-current (AC) power delivery approach. The power draw of modern AI racks has jumped from a historical 20 kilowatts to 36 kilowatts in 2023 and is expected to reach 50 kilowatts by 2027. NVIDIA has even proposed a 600-kilowatt rack architecture, making the energy lost in the multiple AC-DC conversions of the traditional architecture unacceptable.
Data centers are therefore turning to direct-current (DC) power distribution, a new paradigm for improving energy efficiency. DC distribution eliminates redundant AC-DC conversion steps, reducing energy loss and improving overall efficiency. In a 600-kilowatt AI rack, even a small efficiency improvement translates into huge energy savings. For example, delivering 600 kilowatts at 48 V would require a current of up to 12,500 amperes, which is impractical in a traditional architecture.
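The 12,500-ampere figure follows directly from I = P / V; the sketch below repeats that arithmetic and, under an assumed 1 mΩ of busbar resistance, shows how conduction loss (I²R) shrinks as distribution voltage rises. The 400 V and 800 V options are illustrative, not a specific vendor's architecture.

```python
# Why rack distribution voltage matters: I = P / V, and conduction loss scales as I^2 * R.

def bus_current_amps(power_watts: float, voltage_volts: float) -> float:
    """Current required to deliver a given power at a given bus voltage."""
    return power_watts / voltage_volts

def conduction_loss_watts(current_amps: float, resistance_ohms: float) -> float:
    """Resistive loss in the distribution path for a given current."""
    return current_amps ** 2 * resistance_ohms

rack_power = 600_000  # the 600 kW AI rack from the text
for v in (48, 400, 800):  # 48 V from the text; 400 V and 800 V are illustrative HVDC options
    i = bus_current_amps(rack_power, v)
    loss = conduction_loss_watts(i, 0.001)  # assumed 1 mOhm of busbar resistance
    print(f"{v:>4} V bus -> {i:,.0f} A, loss in 1 mOhm of busbar ~ {loss:,.0f} W")
```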
Against this backdrop, wide-bandgap (WBG) semiconductor materials such as gallium nitride (GaN) and silicon carbide (SiC) have become crucial. These materials offer excellent electron mobility, higher breakdown voltage, and lower losses, making them ideal for high-frequency, high-voltage power conversion. They enable higher power density, smaller and lighter power electronics, and less demand for heat dissipation, directly addressing the "energy wall" facing data centers. NVIDIA has adopted STMicroelectronics' SiC and GaN power technologies in its high-power racks to reduce cable bulk and improve efficiency.
Liquid Cooling Technology: Modern data centers face increasingly severe heat dissipation challenges. Cooling has become the second-largest capital expenditure after power infrastructure and the largest non-IT operating expense. With the explosive growth of AI and HPC workloads, traditional air cooling can no longer keep up, and liquid cooling has become an inevitable choice.
The liquid cooling market is expected to grow at a compound annual growth rate of 14% and exceed $61 billion by 2029. Liquid cooling technology has excellent heat dissipation capabilities. Water and dielectric liquids can absorb thousands of times more heat per unit volume than air, enabling a more compact and efficient system. Liquid cooling can reduce cooling energy consumption by up to 90%, with a power usage effectiveness (PUE) close to 1 (even as low as 1.05), and can reduce the physical footprint of data centers by up to 60% while reducing noise.
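As a rough sense of scale, the sketch below estimates the water flow needed to carry away a given heat load (Q = ṁ·c_p·ΔT) and evaluates the PUE ratio itself; the 120 kW rack and 10 K coolant temperature rise are illustrative assumptions.

```python
# How much water flow does it take to carry away a rack's heat?
# Q = m_dot * c_p * dT  ->  m_dot = Q / (c_p * dT)

WATER_CP = 4186.0       # J/(kg*K), specific heat of water
WATER_DENSITY = 997.0   # kg/m^3

def water_flow_lpm(heat_watts: float, delta_t_kelvin: float) -> float:
    """Required water flow in litres per minute for a given heat load and temperature rise."""
    mass_flow = heat_watts / (WATER_CP * delta_t_kelvin)   # kg/s
    return mass_flow / WATER_DENSITY * 1000 * 60           # L/min

def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Power usage effectiveness: total facility power divided by IT power."""
    return total_facility_kw / it_load_kw

# Assumed example: a 120 kW liquid-cooled rack with a 10 K coolant temperature rise.
print(f"{water_flow_lpm(120_000, 10):.0f} L/min")   # ~173 L/min
print(pue(1050, 1000))                               # the PUE of 1.05 cited in the text
```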
High-performance AI chips (including GPUs and custom ASICs) are pushing thermal management to new frontiers, urgently requiring advanced cooling solutions. For example, NVIDIA's GB200 NVL72 system is specifically designed to use direct-to-chip (DTC) liquid cooling, and major hyperscale cloud service providers such as Google, Meta, Microsoft, and AWS are accelerating the deployment of DTC cooling systems.
Liquid cooling technology mainly includes the following types:
Direct-to-chip liquid cooling (DTC): The coolant contacts the chip through a cold plate; it comes in single-phase and two-phase variants. Two-phase DTC absorbs a large amount of heat through the coolant's phase change (liquid to gas) and is more efficient (a rough comparison follows this list).
Rear-door heat exchanger (RDHx): A simple way to add cooling capacity, especially suitable for hot spots in existing air-cooled environments, without modifying the IT hardware.
Immersion cooling: Electronic components are fully immersed in a dielectric fluid, in single-phase or two-phase form. This approach suits extremely high-density or airflow-restricted environments and delivers the greatest heat-transfer efficiency.
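The rough comparison referenced above: the sketch below contrasts the sensible heat a dielectric coolant absorbs over a 10 K temperature rise with the latent heat it absorbs when it boils. The fluid properties are assumed round numbers, not the data sheet of any actual coolant.

```python
# Why two-phase cooling absorbs more heat per kilogram of coolant (illustrative fluid properties).
# Sensible heat: q = c_p * dT ; latent heat of vaporization: q = h_fg

CP_DIELECTRIC = 1100.0        # J/(kg*K), assumed specific heat of a dielectric fluid
H_FG_DIELECTRIC = 100_000.0   # J/kg, assumed latent heat of vaporization

single_phase = CP_DIELECTRIC * 10   # single-phase: 10 K temperature rise
two_phase = H_FG_DIELECTRIC         # two-phase: boiling at roughly constant temperature

print(f"single-phase: {single_phase / 1000:.1f} kJ/kg")  # ~11 kJ/kg
print(f"two-phase:    {two_phase / 1000:.1f} kJ/kg")     # ~100 kJ/kg
```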
To ensure the liquid cooling system operates optimally, multiple sensors need to be deployed (a simple monitoring sketch follows this list):
Temperature sensors: Monitor the inlet and outlet temperatures of the coolant and the temperatures inside the data center (aisles, racks, pipes) to ensure optimal thermal conditions and prevent thermal stress.
Pressure sensors: Monitor the coolant pressure, detect leaks or blockages, and ensure optimal flow, thereby minimizing pump failures and unexpected downtime.
Flow sensors: Ultrasonic flow meters provide accurate, real-time flow measurements of the cooling water loop, enabling efficiency, safety, and cost savings.
Coolant quality sensors: Moisture and conductivity sensors continuously monitor coolant degradation, contamination, and moisture content, especially for the dielectric fluids used in immersion cooling, providing early warnings that help avoid performance degradation and equipment damage.
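The monitoring sketch referenced above: a minimal, threshold-based check over a single loop reading. The sensor fields, limits, and alarm messages are illustrative assumptions, not a real telemetry API.

```python
# Minimal sketch of threshold-based monitoring for a liquid-cooling loop.
from dataclasses import dataclass

@dataclass
class LoopReading:
    inlet_temp_c: float    # coolant supply temperature
    outlet_temp_c: float   # coolant return temperature
    pressure_bar: float    # loop pressure
    flow_lpm: float        # flow rate, litres per minute

LIMITS = {                 # assumed operating limits for illustration
    "max_outlet_temp_c": 45.0,
    "min_pressure_bar": 1.5,
    "min_flow_lpm": 100.0,
}

def check_loop(reading: LoopReading) -> list[str]:
    """Return an alarm string for every reading outside its limit."""
    alarms = []
    if reading.outlet_temp_c > LIMITS["max_outlet_temp_c"]:
        alarms.append(f"outlet temperature high: {reading.outlet_temp_c} C")
    if reading.pressure_bar < LIMITS["min_pressure_bar"]:
        alarms.append(f"pressure low (possible leak): {reading.pressure_bar} bar")
    if reading.flow_lpm < LIMITS["min_flow_lpm"]:
        alarms.append(f"flow low (possible blockage): {reading.flow_lpm} L/min")
    return alarms

print(check_loop(LoopReading(30.0, 47.5, 1.2, 95.0)))  # trips all three alarms
```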
The industry is currently at a thermal tipping point. Traditional air cooling is becoming obsolete for high-density AI workloads, and liquid cooling is now a mandatory transition rather than an optional one. Data center design must be fundamentally rethought to accommodate liquid cooling, including reinforced floors, redesigned electrical systems, and site selection optimized for power density rather than floor area.
Advanced thermal management involves not only cooling hardware but also software-driven dynamic thermal management (DTM) and AI model optimization (such as quantization and pruning) to reduce computational intensity and thermal load. This holistic approach is crucial for maximizing the efficiency of future data centers.
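To illustrate why quantization reduces thermal load, the sketch below compares the weight memory footprint of an assumed 70-billion-parameter model at FP16, INT8, and INT4 precision; smaller weights mean less memory traffic and therefore less energy and heat per query. The parameter count is an assumption for illustration only.

```python
# Rough memory-footprint comparison for model quantization (illustrative only).
# Weight memory ~= parameter count x bytes per parameter.

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GB for a model at a given precision."""
    return num_params * bytes_per_param / 1e9

params = 70e9  # assumed 70-billion-parameter model
for name, nbytes in (("FP16", 2), ("INT8", 1), ("INT4", 0.5)):
    print(f"{name}: {weight_memory_gb(params, nbytes):.0f} GB of weights")
# FP16: 140 GB, INT8: 70 GB, INT4: 35 GB -- less data moved means less heat per query.
```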
Conclusion
Looking ahead, data centers will become increasingly heterogeneous, specialized, and energy-efficient. Chip design will move beyond the traditional CPU/GPU scope toward more specialized processors; advanced packaging technologies (such as HBM and CPO) will be key to improving system performance; and DC power distribution, liquid cooling, and comprehensive sensing will together build the next generation of green, intelligent data centers. This AI-driven silicon revolution requires continuous innovation and strengthened strategic cooperation across the entire semiconductor supply chain to jointly shape the digital future of the AI era.
This article is from the WeChat official account "Semiconductor Industry Observation" (ID: icbank). Author: Du Qin. It is published by 36Kr with permission.