
A 10,000-Character Review of the First China AI Computing Power Conference! The speeches of over 15 big-name experts at the main venue are full of highlights and worth collecting, whether or not you attended.

未来一氪 | 2025-07-04 16:15
The 2025 China AI Computing Power Conference was held, focusing on domestic large models and technological innovation.

Author | China AI Computing Power Conference

On June 26th, at the height of Beijing's summer, a content-packed AI computing power event was held to great enthusiasm.

With the continuous advancement of the new wave of artificial intelligence represented by large models and generative AI, unprecedented demand for AI computing power has emerged. Computing power is the new-quality productive force of the digital-economy era and the cornerstone of AI development.

In 2025, domestic large models represented by DeepSeek made a strong breakthrough, triggering a global boom in model deployment and AI application development. This injected new vitality into the domestic AI computing power market and drove a sharp increase in demand for AI inference computing power. Ultra-large-scale clusters have sprung up, but the explosion of demand has also brought many challenges and is brewing new industry changes.

Therefore, we initiated a summer AI gathering focused on cutting-edge technologies and industry trends: the 2025 China AI Computing Power Conference.

From the breakthrough and rise of domestic AI computing power, to in-depth software and hardware innovation in intelligent computing centers, to solving the problems of deploying computing power in industry, nearly 30 heavyweight guests attended the conference, delivering speeches and reports and engaging in dialogues. They comprehensively deconstructed the changes in the AI computing power landscape triggered by DeepSeek. Insightful remarks came frequently throughout the event, and the views of industry leaders clashed and sparked inspiration. More than 850 people attended on site.

In the exhibition area outside the venue, eight companies, including Alluxio, Yanhuitong, HP, Baishan Cloud Technology, Zhonghao Xinying, Zhongke Jiahe, Kehua Data, and Xingyun Integrated Circuit, showcased their latest technologies and products. The exhibition area was bustling with people, and the atmosphere of communication was enthusiastic.

Exhibition Area

The first AI Computing Power Conference was jointly initiated and hosted by Zhixingxing (under Zhiyi Technology) and Zhidongxi, and co-organized by Xindongxi. Its agenda covered topics such as change and innovation in the AI computing power industry, AI inference computing power, intelligent computing centers, heterogeneous hybrid training on intelligent computing clusters, and super-nodes. The main venue included a high-level forum, a special forum on AI inference computing power, and a special forum on intelligent computing centers; the sub-venue hosted closed-door events: a seminar on heterogeneous hybrid training technology for intelligent computing clusters and a seminar on super-node technology.

Lenovo Group's Game of AI science-popularization video was broadcast at the conference: the Lenovo Neptune all-liquid cooling solution revolutionarily enhances computing power in the AI inference era.

Gong Lunchang, the co-founder and CEO of Zhiyi Technology, announced during the conference's speech session that the China AI Computing Power Conference has officially become one of the "Intelligence Leading the Future" Beijing AI series brand events.

"Intelligence Leading the Future" is a brand event in the field of artificial intelligence in Beijing, created by the Beijing Municipal Science and Technology Commission and the Zhongguancun Administrative Committee. The China Generative AI Conference, also one of the "Intelligence Leading the Future" Beijing AI series brand events, was successfully held from April 1st to 2nd this year.

Gong Lunchang also previewed two large-scale brand events to be held in the second half of the year: the 7th Global AI Chip Summit in Shanghai in September, and the 2025 China Embodied Intelligent Robot Conference in Shenzhen in November.

Gong Lunchang, co-founder and CEO of Zhiyi Technology

The two technology seminars, on heterogeneous hybrid training of intelligent computing clusters and on super-nodes, were successfully held at the sub-venue. At the seminar on heterogeneous hybrid training technology for intelligent computing clusters, reports were shared by Ding Yunfan, chief architect of AI software at Biren Technology; Ban Yourong, technical manager of the Network and IT Technology Research Institute at China Mobile Research Institute; Ao Yulong, head of AI framework R&D at the Beijing Academy of Artificial Intelligence; Pei Zhilin, head of the compilation, computing, and localization team at the Shanghai AI Laboratory; and Liu Yefeng, technical product director of SenseTime's Large Device.

At the super-node seminar, reports from different perspectives were shared by Lu Xiaowei, senior director of heterogeneous hardware, systems, and solutions for Alibaba Cloud's infrastructure; Wang Peng, technical manager of the Network and IT Technology Research Institute at China Mobile Research Institute; Ye Dong, chief network architecture expert at Qiyimoer; and Meng Huaiyu, co-founder and CTO of Lightelligence. Yan Guicheng, chief technology-industry analyst at CITIC Construction Investment Securities, hosted the super-node technology seminar and its round-table panel.

Sub-venue

Next, we bring you the highlights of the speeches and dialogues of the more than 15 guests who shared at the three main-venue forums.

I. High-Level Forum: From Thousand-Chip Nodes to Hundred-Billion-Parameter Large Models, the Domestic AI Chip Ecosystem Bursts with Vitality

AI has become the core driving force behind data center growth. The iteration of large models has driven a sharp increase in computing power demand, prompting a comprehensive upgrade of computing, storage, and network infrastructure. Against the backdrop of strong demand for large-model training and deployment, how can idle computing power be utilized more fully? What stage has the development of domestic AI chips reached? What innovative technologies can optimize large-model inference? Six guests shared their observations and explorations of the latest industry trends.

1. Chen Yili of the China Academy of Information and Communications Technology: "Computing Power Shortage" Coexists with "Computing Power Idleness"; Computing Power Interconnection and the AI Cloud Become the Focus

Chen Yili, the deputy chief engineer of the Cloud and Big Data Research Institute at the China Academy of Information and Communications Technology, said that the large-scale application of AI has led to a surge in demand for intelligent computing power, and the AI cloud has become the focal point of competition in the global AI wave. AI cloud infrastructure needs to cover capabilities such as heterogeneous, efficient scheduling; multi-model support on a single cloud; and an expert knowledge brain. The AI cloud platform makes building AI applications more intelligent and convenient, enhances international influence, and helps the ecosystem thrive.

With the rise of task-based intelligent computing applications, higher requirements are being placed on the locating, scheduling, and deployment efficiency of computing power resources. The China Academy of Information and Communications Technology, together with industry players, is exploring the construction of a computing power Internet, actively advancing research on technologies such as computing power identification, computing power scheduling, transmission protocols, and application adaptation. It is accelerating interconnection between today's computing power "local area networks", gradually establishing a standard system and forming an architecture for the computing power Internet. The core goal is to solve the challenge of finding, calling, and using computing power, gradually forming a computing power Internet with intelligent perception, real-time discovery, and on-demand access.

Chen Yili, deputy chief engineer of the Cloud and Big Data Research Institute at the China Academy of Information and Communications Technology

2. Wang Hua of Moore Threads: Computing Power Demand Grows a Thousand-Fold, and Large Clusters and FP8 Become Strong Demands

Wang Hua, vice-president of Moore Threads, cited research data: from 2020 to 2025, the computing power demand for large-model training grew nearly 1,000-fold, driven by growth in both parameter scale and data volume. Taking DeepSeek-V3 as an example, the computing power required for its training reaches the 10²⁴-FLOP level, and the training time can be compressed to within 13 days on a ten-thousand-card cluster.
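The quoted figures can be sanity-checked with back-of-envelope arithmetic. The derived per-card throughput below is our own estimate, assuming "the 10²⁴ level" means exactly 10²⁴ FLOPs and ideal utilization across the cluster:

```python
# Back-of-envelope check of the training figures cited above.
# Assumption: total training compute taken as exactly 1e24 FLOPs.
total_flops = 1e24        # approximate DeepSeek-V3 training compute
num_cards = 10_000        # ten-thousand-card cluster
days = 13                 # quoted training window

seconds = days * 86_400
per_card_flops = total_flops / (num_cards * seconds)
print(f"implied sustained throughput: {per_card_flops / 1e12:.0f} TFLOPS per card")
# → roughly 89 TFLOPS sustained per card under these assumptions
```

The result only says what the numbers jointly imply; real runs include restarts, communication overhead, and lower sustained utilization.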

To meet this computing power demand, Moore Threads provides full-precision computing power including FP8, effectively supporting mixed-precision training and significantly improving training efficiency. It deploys ten-thousand-card clusters, develops a complete software and hardware stack, and provides ready-to-use products to quickly meet the computing power demand for large-model training. It has also built rich cluster monitoring and diagnostic capabilities, achieving minute-level fault location in large-scale clusters.

In addition, Moore Threads has built a mixed-precision training solution supporting data types such as FP8, BF16, and FP32, and has open-sourced large-model training components such as Torch-MUSA, MT-MegatronLM, and MT-TransformerEngine. It has reproduced DeepSeek-V3's mixed-precision training. Experimental results on multiple models show that the overall performance of its solution improves by 20%-30%, while training accuracy remains consistent with the industry mainstream.
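FP8 mixed-precision training hinges on how values round into very narrow formats. As a pure-Python illustration of the common E4M3 FP8 variant (a sketch of the format's rounding behavior only, not Moore Threads' implementation; subnormals and NaN/Inf are ignored):

```python
import math

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest value representable in an E4M3-style FP8
    format: 1 sign bit, 4 exponent bits, 3 mantissa bits, max normal 448.
    Illustrative sketch; subnormals and special values are omitted."""
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    mag = min(abs(x), 448.0)                    # saturate at E4M3 max normal
    e = max(math.floor(math.log2(mag)), -6)     # clamp to smallest normal exponent
    m = mag / 2.0**e                            # mantissa in [1, 2) for normals
    m = round(m * 8) / 8                        # keep 3 fractional mantissa bits
    return sign * m * 2.0**e

print(quantize_e4m3(0.3))   # 0.3125 -- the nearest E4M3-representable value
```

With only three mantissa bits, nearby values collapse to the same representable number; this is why FP8 training schemes pair the narrow formats with higher-precision (BF16/FP32) accumulation, as in the mixed-precision solution described above.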

Wang Hua, vice-president of Moore Threads

3. Yang Gong Yifan of Zhonghao Xinying: Interpreting the Innovative Design of the TPU Architecture, How Domestic AI Chips Can Seize Local Opportunities

Yang Gong Yifan, the founder and CEO of Zhonghao Xinying, said that dedicated AI chips are an inevitable trend for AI infrastructure. The TPU architecture is designed for large AI models, using multi-dimensional computing units to optimize data reuse and improve computing efficiency. Through more aggressive data transmission strategies and smaller control units, it leaves more room for on-chip memory and computing units, and its scalability is better suited to ultra-large-scale computing.

Zhonghao Xinying's fully self-developed high-performance TPU-architecture AI chip, "Chana", was successfully taped out and mass-produced in 2023; its computing performance is nearly 1.5 times that of a well-known overseas GPU chip. Built on "Chana", the high-performance AI server and large-scale AI computing cluster "Taize" support high-speed interconnection of 1,024 cards and can support the computing of super-large-scale models with over a hundred billion parameters.

As the cost of large models falls, AI chip architecture has begun to adapt deeply to the dynamic sparse computing paradigm, forming a new "algorithm-defined hardware" R&D model. Having reduced their dependence on the CUDA ecosystem, domestic AI chips can be more flexible in architecture design, adapting to new local trends and needs by providing customized toolchains and optimized compilers.

Yang Gong Yifan, founder and CEO of Zhonghao Xinying

4. Xu Lingjie of Magic Shape Intelligence: Large Models Need "Thousand-Chip" Super-Nodes, and There Are Five Key Factors for Future Architectures

Xu Lingjie, the founder and CEO of Magic Shape Intelligence Technology, opened his speech humorously: "In the past decade, the most valuable industry in China was real estate. In the future, the most valuable one may still be real estate, but it won't house people; it will house machines."

Research data show that the total power consumption of global data centers is equivalent to that of a single developed country. Stronger large models require larger clusters, and faster large models require super-nodes; a larger high-bandwidth interconnection domain is the core of super-node design. Current computing power density is far from sufficient: to approach the computing power density of the human brain, "thousand-chip" super-nodes must be constructed to build reconfigurable AI computing power centers.

How do you build a thousand-chip interconnection network? Xu Lingjie summarized five key factors for future super-node architectures: ultra-high-density computing power nodes, thousand-chip multi-cabinet cascading backplane connections, 800V power supply input, full interconnection of switching chips, and full-coverage cooling.

He also shared three requirements that next-generation computing power infrastructure places on chips: flexible combination and decoupling at the board and packaging levels, integration of co-packaged optics into the design, and a "Cluster First" product concept. Software-hardware synergy will unleash the potential of ultra-large clusters.

Xu Lingjie, founder and CEO of Magic Shape Intelligence Technology

5. Cui Huimin of Zhongke Jiahe: AI Compilation Optimization Boosts Inference Performance and Effectively Expands the Domestic AI Chip Ecosystem

Cui Huimin, a researcher at the Institute of Computing Technology of the Chinese Academy of Sciences and founder of Zhongke Jiahe, said that demand for private deployment of large-model inference has grown significantly, but it faces multiple challenges, such as a wide variety of hardware, diverse requirements, and multi-model deployment.

Zhongke Jiahe has built an inference engine and software stack for large models around compilation optimization and has accumulated many practical cases: it implements in-depth memory optimization in the inference engine to improve GPU memory utilization, and it realizes multi-dimensional parallelism in large-scale inference to make effective use of compute, memory access, and communication resources. Through multiple joint optimizations, its inference technology increased QPS by more than 50% in a collaboration with an Internet company and effectively supports 128K long context
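A back-of-envelope KV-cache estimate gives a sense of why memory optimization matters at 128K context. The model shape below (layers, heads, width, precision) is an illustrative assumption, not Zhongke Jiahe's configuration:

```python
# KV-cache memory for one 128K-token sequence (hypothetical model shape).
n_layers = 32          # assumed transformer depth
n_kv_heads = 8         # assumed KV heads (grouped-query attention)
head_dim = 128         # assumed per-head width
seq_len = 128 * 1024   # 128K-token context
bytes_per = 2          # FP16/BF16 storage

# K and V each store n_layers * n_kv_heads * head_dim values per token.
kv_bytes = 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per
print(f"KV cache per sequence: {kv_bytes / 2**30:.0f} GiB")
# → 16 GiB for a single sequence under these assumptions
```

At tens of GiB per long-context sequence, techniques like the memory optimization and multi-dimensional parallelism described above directly determine how many concurrent requests a card can serve, and hence QPS.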