Alibaba's PPU, Baidu's Kunlun Chips: China's AI Enters Its "Huawei Moment"
The domestic AI chip market is undergoing a profound transformation, and "de-NVIDIAization" has become a buzzword.
At the core of this transformation, Chinese technology giants led by Alibaba and Baidu are pushing ahead with in-house AI chip R&D in an attempt to challenge NVIDIA's monopoly position in the domestic AI chip field.
Since September, good news about domestic AI chips has come in quick succession: internet giants such as Alibaba and Baidu have announced that they will use self-developed chips for part of the training of their core AI models. Meanwhile, details of the new-generation products from Alibaba's T-Head and Huawei's Ascend have come to light, showing performance that is catching up with, and in some respects surpassing, NVIDIA's.
The capital market has responded positively. Many investment banks have raised their valuations of domestic technology giants such as Alibaba and Baidu, and Cathie Wood, a star fund manager on Wall Street, bought Alibaba shares for the first time in four years. Since the end of August, the Hong Kong-listed shares of Baidu and Alibaba have gained around 50%.
Figure: Performance data of Alibaba and Baidu's Hong Kong stocks. Data source: Wind, compiled by 36Kr.
So what exactly is driving China's shift from buying foreign AI chips to developing its own?
Accelerated "de-NVIDIAization" of domestic chips
The most direct driving force behind this "de-NVIDIAization" movement stems from the increasingly tense geopolitics and the resulting deep concerns about the stability and security of the AI supply chain.
In April this year, the US government banned NVIDIA from selling H20 chips to China. Exports resumed in July, but on the condition that 15% of the revenue be surrendered. In response to the US restrictions, China's countermeasures have escalated steadily: in late July, NVIDIA was summoned for talks over reported "vulnerability backdoors" in the H20; in mid-August, there were rumors that H20 production had been suspended; and the recent anti-dumping investigation has pushed the storm to a climax.
Figure: Dynamics related to AI chips in 2025. Data source: Central China Securities, compiled by 36Kr.
The escalating contest between the two countries has heightened the supply chain risks of overseas AI chips, which is potentially fatal for AI players that depend on long-term, stable investment. Weighing these risks, more and more Chinese technology giants have recognized the importance of chips they can control themselves, setting off a huge wave of "de-NVIDIAization".
This wave has brought obvious negative impacts to NVIDIA.
In the first fiscal quarter of this year, NVIDIA booked an inventory impairment provision of about $4.5 billion because of the export restrictions on the H20. As the storm has intensified, NVIDIA's revenue from the Chinese mainland has continued to fall sharply. Its financial report shows that in the second quarter of fiscal year 2026, revenue from the Chinese mainland dropped to $2.77 billion, a quarter-on-quarter decline of nearly 50%, and its share of total revenue fell to 6%. Over the same period, the revenue growth rates for the United States, Singapore, and the Taiwan region of China all rose.
Figure: NVIDIA's revenue and proportion from the Chinese mainland. Data source: Wind, compiled by 36Kr.
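The revenue figures quoted above can be cross-checked with simple arithmetic. The following is only a rough sketch: the helper names are hypothetical, the "nearly 50%" decline is treated as an upper bound of exactly 50%, and the 6% share is taken as a rounded figure, so the results are approximations rather than reported numbers.

```python
# Figures quoted in the article (billions of USD and fractions).
china_revenue_q2 = 2.77   # mainland China revenue, Q2 of fiscal year 2026
qoq_decline = 0.50        # "nearly 50%", treated here as an upper bound (assumption)
china_share = 0.06        # mainland China share of total revenue (rounded in the article)


def implied_previous(current: float, decline: float) -> float:
    """Value before a fractional decline, from current = previous * (1 - decline)."""
    return current / (1.0 - decline)


def implied_total(part: float, share: float) -> float:
    """Total implied by a part and its fractional share, from part = total * share."""
    return part / share


previous_quarter = implied_previous(china_revenue_q2, qoq_decline)
total_revenue = implied_total(china_revenue_q2, china_share)

print(f"Implied prior-quarter China revenue: at most about ${previous_quarter:.2f}B")
print(f"Implied total quarterly revenue: about ${total_revenue:.1f}B")
```

Because both inputs are rounded, the implied totals are only as precise as the percentages they are derived from.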
In sharp contrast to NVIDIA's dilemma, domestic customized AI chips are rising rapidly under this wave.
On August 21st, DeepSeek-V3.1 was released, with the announcement that it uses an FP8 precision format to improve its fit with domestic chips. On September 16th, the PPU chip from Alibaba's T-Head was unexpectedly revealed on the "News Broadcast". Its video memory capacity and inter-chip bandwidth exceed those of NVIDIA's A800 and are comparable to those of the H20. More importantly, according to data from China Merchants Bank International, thanks to the domestic 7nm process and 2.5D packaging, the cost of a single PPU card is 40% lower than that of the imported H20.
Figure: Information on domestic AI chips. Data source: Shanxi Securities, compiled by 36Kr.
Only two days after the PPU was revealed, on September 18th, Huawei made a rare announcement of the detailed roadmap for its Ascend chips over the next three years. By supporting low-precision computing and a hybrid architecture, and by doubling interconnect bandwidth and computing power, Huawei is catching up across the board in technology. More important than single-card performance, the Atlas 950 SuperPod, built on the self-developed interconnect protocol "Lingqu" and the Ascend 950 series of chips, can form a unified computing power base at the million-level scale, with various performance metrics exceeding those of NVIDIA's next-generation NVL144 and NVL576 in 2027, which would make it the world's most powerful computing cluster.
Figure: Progress of Huawei's Ascend chips. Data source: Great Wall Securities, compiled by 36Kr.
The breakthrough in product performance has also accelerated the adoption of domestic solutions in domestic computing power infrastructure. At the end of August, Baidu's Kunlun Chip ranked first in three bid packages in China Mobile's centralized procurement, with the winning bids reaching the billion level.
This win mirrors a broader trend: domestic AI chip manufacturers are eating into NVIDIA's market share at an accelerating pace. IDC data shows that in 2024, NVIDIA's market share in China dropped from 85% to 70%, while shipments of domestic AI chip brands exceeded 820,000 and their market share rose significantly to 30%.
Figure: Continuous decline of NVIDIA's market share in China. Data source: IDC, compiled by 36Kr.
Bernstein predicts that in 2025, NVIDIA's share of the Chinese AI chip market will fall further to 54%, while the share of domestic manufacturers will rise significantly, creating a more diversified competitive landscape.
Figure: Evolution of the domestic AI chip market pattern. Data source: Bernstein, compiled by 36Kr.
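The market share figures above imply a rough market size. The sketch below works through that arithmetic under two stated assumptions: that the IDC shares are measured in units shipped, and that the 30% share and the 820,000 shipments refer to the same 2024 base. Since the article says shipments "exceeded" 820,000, the unit counts are best read as approximate lower bounds. All variable names are illustrative.

```python
# Figures quoted in the article.
domestic_units_2024 = 820_000   # domestic AI chip shipments in 2024 ("exceeded")
domestic_share_pct = 30         # domestic share in 2024, percent (IDC)
nvidia_share_pct = {
    "before": 85,               # NVIDIA share before the 2024 decline (IDC)
    "2024": 70,                 # NVIDIA share in 2024 (IDC)
    "2025_forecast": 54,        # Bernstein forecast for 2025
}

# Assumption: shares are unit-based, so total units = domestic units / domestic share.
implied_total_units = domestic_units_2024 / (domestic_share_pct / 100)
implied_nvidia_units = implied_total_units * nvidia_share_pct["2024"] / 100

# Share changes, in percentage points.
drop_in_2024 = nvidia_share_pct["before"] - nvidia_share_pct["2024"]
forecast_drop_in_2025 = nvidia_share_pct["2024"] - nvidia_share_pct["2025_forecast"]

print(f"Implied 2024 market size: about {implied_total_units:,.0f} units")
print(f"Implied 2024 NVIDIA shipments: about {implied_nvidia_units:,.0f} units")
print(f"NVIDIA share change in 2024: -{drop_in_2024} points")
print(f"Forecast NVIDIA share change in 2025: -{forecast_drop_in_2025} points")
```

If the shares were instead measured by revenue, the unit estimates would not hold, although the percentage-point changes would be unaffected.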
Historical mirror: The path from "general-purpose" to "customized" for mobile phone chips
The current wave of customization of Chinese AI chips is very similar to the development process of mobile phone chips in the past decade or so.
In the early days of smartphones, the chip field was dominated by general-purpose chip manufacturers such as Qualcomm and MediaTek. Their solutions offered a high degree of compatibility and standardization, which significantly lowered the R&D threshold for mobile phone manufacturers and let them launch smartphone businesses quickly and seize market opportunities.
However, as the industry matured, the drawbacks of general-purpose chips began to emerge.
Firstly, mobile phone chips have long been monopolized by a few enterprises such as Qualcomm and MediaTek, which has led to mobile phone manufacturers being subject to supply instability for a long time and having to bear high additional costs, squeezing their profits. Taking the "Qualcomm tax" as an example, Apple has to pay Qualcomm a patent fee of 5% of the selling price for each iPhone sold. In 2016, Apple's patent fees reached as high as $2.8 billion, accounting for 6% of its profit that year.
Secondly, the architecture of general-purpose chips cannot fully match the product iteration plans and customization needs of mobile phone manufacturers. This delays improvements in product performance and makes it hard to achieve synergy between software and hardware, which weakens the user experience.
Thirdly, the convergence of core hardware has left mobile phone manufacturers able to innovate only by "spec-stacking" in peripheral areas such as cameras and screens, making it difficult to build real differentiation and brand premiums and holding back their move upmarket.
It is precisely because of these obvious defects that leading manufacturers, represented by Apple, embarked on the path of developing their own chips, driving the transition of smartphone chips from "general-purpose" to "special-purpose".
In 2010, Apple launched its first self-developed chip, the A4, which laid the foundation for the iPhone's dominant position in the smartphone field. The A-series chips use self-developed architectures and advanced process technologies and are tightly coordinated with the scheduling logic of iOS, achieving end-to-end optimization of software and hardware. This not only keeps the iPhone's hardware performance in the lead but also forms a unique technological ecosystem based on software-hardware synergy. It puts the iPhone's user experience far ahead and gives Apple a moat that is hard to cross, which is the key to its long-standing place in the top tier of high-end smartphones.
After Apple's success, Huawei followed suit and began developing its own chips.
In 2013, Huawei developed the Kirin chip in-house through HiSilicon, integrating Huawei's core technologies in communications, AI, and image processing. This not only optimized overall performance but also gave Huawei a first-mover advantage in the 5G era. More importantly, the deep integration of the Kirin chip with the Hongmeng (HarmonyOS) system has built a strong ecosystem moat for Huawei's phones, allowing the company to shed the label of "mobile phone assembler" and gain a foothold in the domestic high-end market on the strength of its differentiation.
The more far-reaching impact is that, by relying on customized chips, the two companies have reduced their dependence on external suppliers and fundamentally improved their cost structures. The resulting "software-hardware integration" ecosystem advantage has also steadily raised their brand premiums and widened their profit margins. In 2024, the gross profit margin of Apple's iPhone business was close to 40%, far higher than the industry average.
The "Huawei moment" of domestic AI chips
The current wave of "de-NVIDIAization" in Chinese AI chips closely replicates the development path of mobile phone chips.
In essence, the localization and customization of AI chips are not only a consideration for supply chain security but also an inevitable choice for the industry after AI shifts from training to inference.
As the iteration of large models slows, market demand is shifting from "crazy computing power stacking" to more practical commercial deployment. Against this background, the focus of AI has also shifted from "training" to "inference". According to remarks by NVIDIA's CEO on the earnings call for the first quarter of fiscal year 2026, the volume of AI inference tokens generated has increased tenfold in the past year.
Compared with training, inference tasks require less computing power but are more demanding on cost, power consumption, and latency. NVIDIA's general-purpose GPUs are powerful, but their high cost, low energy efficiency, and high latency mean they are not a perfect match for inference tasks. In particular, the relatively high price of the restricted versions sold in the domestic market has greatly reduced their cost-performance ratio.
This change in market demand has directly promoted the customization path of the domestic chip industry.
In terms of fit, customized chips strip out a large number of redundant functions compared with general-purpose GPUs, so they can achieve order-of-magnitude improvements in power consumption, cost, and latency on specific tasks. For AI inference workloads that demand large scale, high concurrency, and low latency, they are far more efficient than general-purpose GPUs.
Figure: Comparison between general-purpose chips and customized chips. Data source: Minsheng Securities, compiled by 36Kr.
The growing maturity of domestic chip design and the domestic supply chain has also given domestic chips the confidence to catch up with international performance levels, making it possible to shift AI computing power infrastructure to domestic solutions.
Just like Apple and Huawei in the smartphone era, Chinese AI players are no longer satisfied with simply buying NVIDIA's general-purpose GPUs. Instead, they are starting to pursue a two-pronged approach of purchasing foreign chips while developing their own.
On the one hand, in training, they rely on the high performance of advanced international chips to keep their models iterating, buying time for further autonomy. On the other hand, they are accelerating the development of customized chips and actively adapting them to mainstream large models at home and abroad, seeking differentiated advantages in energy efficiency and scenario-specific optimization, and improving efficiency and cost through deep software-hardware synergy.
This indicates that the Chinese AI industry is transforming from a mere consumer of computing power into an independent ecosystem builder. This is not only a defensive strategy for coping with external pressure but also an inevitable choice for the Chinese technology industry as it moves up the value chain.