The 2024 China Generative AI Conference (Shanghai) Concludes Successfully: All the Highlights from Day Two's AI Infra Summit in One Article!
On December 6, 2024, the two-day 2024 China Generative AI Conference (Shanghai) came to a successful close.
Over the two days, 51 speakers from industry, academia, research, and investment shared a wealth of practical insights. More than 4,000 people registered or inquired about the conference, over 1,200 attended in person, and online viewership of the Large Model Summit and the AI Infra Summit held in the main venue exceeded 1.04 million.
Attendee enthusiasm ran high: the main venue and sub-venues were packed, and industry exchanges around the exhibition area were lively. New products and technologies from 15 enterprises drew widespread attention and discussion.
▲ Conference exhibition area
The conference was themed "Intelligent Leap, Creating Infinity". From a forward-looking perspective, the 51 speakers from industry, academia, research, and investment dissected generative AI's product and technology innovations, commercialization paths, future trends, and cutting-edge research focuses.
At today's AI Infra Summit, Dai Guohao, associate professor at Shanghai Jiao Tong University and co-founder and chief scientist of Wuwen Xinqiong, argued that the industry should focus more on achieving more efficient token throughput per unit of computing power. The computing power actually available to large models depends not only on a chip's theoretical peak: software-hardware co-optimization can improve utilization, and multi-heterogeneous adaptation can enlarge the overall pool of computing power.
Guo Wen, head of Beidian Digital Intelligence Computing Cloud; King.Cui, president of GMI Cloud Asia-Pacific; Cong Peiyan, head of Alibaba Cloud's intelligent computing cluster product solutions; Zhu Guoliang, head of the Zhonghao Xinying chip software stack; and Zhou Qiang, founder and chairman of Guangyu Xinchen, shared their views on topics including the full-stack AI factory, how AI enterprises going global can close the computing power gap, high-performance intelligent computing clusters, the "No CUDA" software stack for domestic TPU chips, and the road to personal large models.
Gao Xuefeng, founder and CEO of Fengqing Technology; Mao Yujie, head of generative AI products at Sound Network; Xie Yu, technical lead of Tencent Cloud's vector database; Wang Nan, co-founder and CTO of Jina AI; Luan Xiaofan, partner and R&D VP at Zilliz; Zhang Yingfeng, founder and CEO of Yingfeiliu; and Fu Zhengjia, chief architect at Alluxio, delivered talks titled "From Data to Knowledge: The Cornerstone of AI Reshaping Hundreds of Industries", "Generative AI-Driven Technological Change and Experience Innovation in Real-Time Interaction", "TencentVDB Vector Database", "Opportunities and Challenges of AI Infra under the RAG Paradigm", "RAG Is Strong, but Vector Databases Are Not a Panacea", "The New Generation of Enterprise-Level Multimodal RAG Engine", and "High-Performance AI Data Base".
The afternoon roundtable, "New Changes and Opportunities for AI Infra as Large Models Enter Deep Waters", was hosted by Liu Jingyuan, executive director of Delian Capital, with three guests sharing their insights: Fu Zhengjia, chief architect at Alluxio; Luan Xiaofan, partner and R&D VP at Zilliz; and Zhang Yingfeng, founder and CEO of Yingfeiliu.
On the first day of the conference, 17 guests discussed cutting-edge topics such as large language models, multimodal large models, embodied intelligence, AI-native applications, music generation, 3D AIGC, and the industry applications of AI agents and vertical-industry large models.
Beyond the Large Model Summit held in the main venue on day one and today's AI Infra Summit, the conference's sub-venues hosted the Terminal-side Generative AI Technology Seminar, the AI Video Generation Technology Seminar, and the Embodied Intelligence Technology Seminar over the two days. 17 young scholars and technical experts gave talks, and replays of these three paid seminars will be released later.
1. From Intelligent Computing Clusters to Native Acceleration Stacks: Tackling the Pain Points of Industrial Deployment to Break Through the Large-Model Computing Power Bottleneck
AI's development has brought enormous challenges in data, computing power, and energy. As the foundation supporting large-model operation and generative AI application development, AI Infra has surged to the forefront.
How should a high-quality intelligent computing center be built, and how can the whole industry chain, from chips to applications, collaborate efficiently? Several guests offered in-depth insights.
1. Dai Guohao, Associate Professor at Shanghai Jiao Tong University, Co-founder and Chief Scientist of Wuwen Xinqiong
Under the Scaling Law, data has become one of the factors constraining AI's continued development. Reasoning models typified by OpenAI's o1 can break through the data bottleneck, but the shift in computing paradigm makes computing power demand grow exponentially, which may create an imbalance between the hardware system's energy consumption and actual demand, challenging the industry's sustainable development.
In response, Dai Guohao argued that the industry should currently focus on achieving more efficient token throughput per unit of computing power. The computing power actually available to large models depends not only on a chip's theoretical peak: software-hardware co-optimization can improve utilization, and multi-heterogeneous adaptation can enlarge the overall pool of computing power. He shared his team's research progress and practical results in software-hardware co-design, multi-heterogeneous adaptation, and terminal-side intelligence, results that can help the industry raise token throughput efficiency in large-model scenarios.
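As a rough illustration of the "token throughput per unit of computing power" metric, the sketch below normalizes generation speed by a chip's theoretical peak compute. All numbers are invented placeholders, not figures from the talk:

```python
# Hypothetical illustration: comparing the same chip before and after
# software-level optimization, by tokens generated per second per unit
# of theoretical peak compute (tokens/s per TFLOPS).
# All numbers below are made-up placeholders, not data from the talk.

def tokens_per_tflops(tokens_per_second: float, peak_tflops: float) -> float:
    """Effective token throughput normalized by theoretical peak compute."""
    return tokens_per_second / peak_tflops

# Same chip (312 TFLOPS peak), before and after optimization.
baseline = tokens_per_tflops(tokens_per_second=1500, peak_tflops=312)
optimized = tokens_per_tflops(tokens_per_second=2400, peak_tflops=312)

print(f"baseline:  {baseline:.2f} tokens/s per TFLOPS")
print(f"optimized: {optimized:.2f} tokens/s per TFLOPS")
print(f"gain: {optimized / baseline:.2f}x")  # 1.60x
```

The point of the metric is that the chip's peak compute is fixed in both cases; what improves is how much useful work each unit of that compute delivers.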
2. Guo Wen of Beidian Digital Intelligence: Using AI Factories to Fill the Industry-Chain Gap Between Domestic Computing Power Supply and Demand
"For the industry to develop, innovation cannot stop at the technical level; it must extend across processes, systems, and organizations." Guo Wen, head of Beidian Digital Intelligence Computing Cloud, shared practical thinking on building a complete AI production line in the AI era, covering computing power, algorithms, data, and ecosystem.
Guo Wen said the biggest obstacle to deploying domestic chips in the AI industry today is the industry-chain gap between computing power supply and demand. To address it, Beidian Digital Intelligence launched the first "Domestic Computing Power PoC Platform", building a full-stack AI factory on the Beijing Digital Economy Computing Center as its carrier, fully adapting and connecting scenarios and models down to the chip level, and pushing the intelligent computing center to evolve from a cost center into a new-quality-productivity center that drives regional development.
3. King.Cui of GMI Cloud: High-Stability GPU Clusters Become the Key to AI Enterprises' Global Layout
As China's AI companies accelerate their global expansion, computing power, as a core means of production, is playing an important role. High-stability GPU clusters cut costs and raise efficiency, helping enterprises win in the wave of AI globalization.
King.Cui, president of GMI Cloud Asia-Pacific, noted that to guarantee high GPU-cluster stability, GMI Cloud uses a self-developed cloud cluster engine with active fault detection to allocate compute, storage, and network resources efficiently.
GMI Cloud is a top-10 NVIDIA NCP and runs a strict verification process before delivery. It also works with IDC operators on spare parts and maintenance, shortening delivery times to keep downtime to a minimum.
4. Cong Peiyan of Alibaba Cloud: The Lingjun Intelligent Computing Cluster Must Deliver Stability and Extreme Performance While Supporting Extreme Scaling Across Dimensions
Cong Peiyan, head of Alibaba Cloud's intelligent computing cluster product solutions, predicted that model performance will keep improving with growth in parameters, datasets, and computing power, and that the Scaling Law still has headroom. The design paradigm for AI intelligent computing clusters should shift to being GPU-centric.
Alibaba Cloud has launched the Lingjun intelligent computing cluster for ultra-large-scale distributed training: it can scale to 100,000 GPUs, and at thousand-GPU scale its linear acceleration ratio reaches 96%. Alibaba Cloud's self-developed Panjiu server decouples CPU and GPU, allowing a single machine to be upgraded to 16 GPUs, and the HPN 7.0 network architecture can connect up to 100,000 GPUs at maximum scale.
Cluster stability is crucial: Alibaba Cloud's 3,000-GPU intelligent computing cluster sustained a stable-training time ratio of 99% over one month.
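The "linear acceleration ratio" cited above is commonly understood as the measured speedup divided by the ideal linear speedup when scaling out. A minimal sketch of that calculation, with illustrative throughput numbers (not Alibaba Cloud's measurements):

```python
# Linear acceleration (scaling) ratio: measured speedup relative to the
# ideal linear speedup when growing the GPU count from a baseline.
# Throughput numbers below are illustrative, not vendor measurements.

def linear_scaling_ratio(base_gpus: int, base_throughput: float,
                         scaled_gpus: int, scaled_throughput: float) -> float:
    """Ratio of actual speedup to ideal (perfectly linear) speedup."""
    actual_speedup = scaled_throughput / base_throughput
    ideal_speedup = scaled_gpus / base_gpus
    return actual_speedup / ideal_speedup

# Scaling from 8 GPUs to 1,024 GPUs ("thousand-card scale"):
ratio = linear_scaling_ratio(base_gpus=8, base_throughput=100.0,
                             scaled_gpus=1024, scaled_throughput=12288.0)
print(f"linear scaling ratio: {ratio:.0%}")  # 96%
```

A ratio of 1.0 would mean communication and synchronization cost nothing; 96% at thousand-GPU scale means the cluster retains nearly all of the ideal throughput as it grows.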
5. Zhou Qiang of Guangyu Xinchen: Solving the Problem of "Large Models Don't Understand You", Personal Large Models Usher in Opportunities
As a major branch developing in parallel with general, industry, and enterprise large models, personal large models have also entered a period of rapid growth. Zhou Qiang, founder and chairman of Guangyu Xinchen, said that the personal large model solves the problem of "large models don't understand you". As makers of terminal devices such as phones, PCs, wearables, and XR headsets go all-in on AI, the road for personal large models will grow ever wider.
He noted that the personal large model is also called the terminal-side large model. It is expected to resolve the on-device agent's pain points in performance, power consumption, and cost, bringing a real AI phone into daily life. Terminal-side AI has five advantages: timeliness, reliability, low cost, privacy protection, and customization. The core challenge in building a terminal-side large model today is solving the twin problems of memory bandwidth and capacity.
6. Zhu Guoliang of Zhonghao Xinying: Building a "No CUDA" Software Stack for Domestic TPU Chips
Zhu Guoliang, the head of the Zhonghao Xinying Chip Software Stack, introduced their practical experience in building a "No CUDA" software stack for domestic TPU chips.
Zhonghao Xinying's Chana chip adopts a VLIW instruction set architecture. Facing the vast CUDA ecosystem, the team solved problems in libraries, parallel computing, and programming one by one, and fully self-developed the user-space and kernel-space drivers for efficient management of the chip.
For ecosystem compatibility, Zhonghao Xinying's underlying software stack supports PyTorch and all mainstream training and inference frameworks. Zhonghao Xinying can currently provide customized end-to-end cloud intelligent computing solutions and supports domestic operating systems.
2. From Enterprise Agents and Vector Databases to RAG: New Challenges Emerge in AI Infra Basic Software
In the afternoon session, the guests went deeper into AI Infra, sharing industry observations and insights on agent development and management platforms, real-time voice, vector databases, embedding models, RAG technology, data orchestration, and more. A raft of new platforms, products, and technologies stepped forward to empower the industry.
1. Gao Xuefeng of Fengqing Technology: From Data to Knowledge, Bridging the Gap between Generative AI and Decision Intelligence
Gao Xuefeng, founder and CEO of Fengqing Technology, said that the key technical breakthrough for applying generative AI to enterprise decision-making, and for bridging the gap between generative AI and decision intelligence, is integrating symbolic logic reasoning on the inference framework side.