
Cutting Costs for AI: MoXin Secures Nearly 1 Billion Yuan in Series C Financing

Friends of 36Kr, 2026-05-28 11:40
Whoever can reduce the cost of token generation may get the ticket to the next round of competition.

The battle for AI computing power is entering a new stage. The parameter counts of large models are rising from hundreds of billions to trillions, and the number of tokens required for a single inference is growing exponentially. Inference cost has become the core bottleneck restricting large-scale commercial deployment across the industry. Whoever can reduce the cost of token generation may win a ticket to the next round of competition.

"Thanks to the advantages of sparse computing, the per-token cost of MoXin's products can be far lower than that of mainstream competitors," said Wang Lvyu, secretary of the board of directors of MoXin Artificial Intelligence and general manager of its Corporate Development and Capital Market Department.

What is sparse computing?

In short, traditional AI chips use a "dense computing" mode: they perform operations on every parameter in a matrix equally, so a large share of the computation is spent on invalid or redundant data. Sparse computing, by contrast, uses algorithms to identify this wasted work in advance and skip it, computing only the parameters that actually matter. Under the same hardware conditions, it can therefore significantly improve effective computing power while reducing energy consumption and cost.
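The difference between dense and sparse computing can be illustrated with a toy matrix-vector product. The sketch below is a generic illustration, not MoXin's algorithm or hardware behavior; the function names (dense_matvec, compress_rows, sparse_matvec) and the sample numbers are assumptions made up for this example. It counts multiply-accumulate operations (MACs) in both modes and shows that skipping zero weights leaves the result unchanged.

```python
def dense_matvec(weights, x):
    """Dense mode: multiply every weight by its input, zeros included."""
    macs = 0
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            acc += w * xi
            macs += 1
        out.append(acc)
    return out, macs


def compress_rows(weights):
    """Identify the nonzero weights once, ahead of inference.

    Each row becomes a list of (column index, value) pairs.
    """
    return [[(j, w) for j, w in enumerate(row) if w != 0.0] for row in weights]


def sparse_matvec(compressed, x):
    """Sparse mode: only the stored nonzero weights are multiplied."""
    macs = 0
    out = []
    for row in compressed:
        acc = 0.0
        for j, w in row:
            acc += w * x[j]
            macs += 1
        out.append(acc)
    return out, macs


# Hypothetical 3x4 weight matrix in which 9 of the 12 entries are zero.
weights = [
    [0.0, 2.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 3.0],
    [0.0, 0.0, 0.0, 0.0],
]
x = [1.0, 2.0, 3.0, 4.0]

dense_out, dense_macs = dense_matvec(weights, x)
sparse_out, sparse_macs = sparse_matvec(compress_rows(weights), x)
print(dense_out, dense_macs)    # [4.0, 13.0, 0.0] 12
print(sparse_out, sparse_macs)  # [4.0, 13.0, 0.0] 3
```

In this toy case the sparse path does a quarter of the multiplications for the same output. Real savings depend on how sparse the model is and on whether the hardware can exploit the irregular access pattern.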

Touzhongwang recently learned that MoXin, which focuses on providing AI computing power platforms for cloud and terminal devices, has completed a Series C financing round of nearly RMB 1 billion. Institutions including Shenzhen Capital Group, Greater Bay Area Common Home Fund, Liding Capital, and Yunsheng Capital participated, and existing shareholders such as Triumph Venture Capital, Shengjing Jiacheng, and Yanshan Technology continued to increase their investments.

"The industry is still in a stage of rapid expansion and rising capital investment, and the overall market opportunity is huge," Wang Lvyu said of the current AI computing power market. "However, inference cost is the core factor determining whether an enterprise can survive the industry cycle and enter the next round of competition."

In the domestic AI chip sector, MoXin has taken a differentiated path: it relies on self-developed sparse algorithms to "subtract" from chip computation and, through the co-design of algorithms, software, and hardware, has built an innovative computing power solution aimed at driving down the cost of generating each token.

The One Who "Does Subtraction"

In 2018, two Carnegie Mellon University alumni met in Silicon Valley and set out together to build AI chips, founding MoXin Artificial Intelligence.

Wang Wei, the founder and CEO, holds a master's degree in ECE from Carnegie Mellon University and is a Silicon Valley chip expert with over 15 years of experience. He worked at Qualcomm and then Intel, where he served as a core architect of Intel's fifth- to tenth-generation CPU processors. The chips he has led or participated in have a cumulative mass production volume of over 5 billion units.

Dr. Yan Enxu, the co-founder and chief scientist, also graduated from Carnegie Mellon University. He has worked in machine learning for more than a decade, is the inventor of the neural network dynamic sparse algorithm, and created the dual-sparse algorithm, a revolutionary idea for further improving AI computing efficiency through "weight sparsification + activation sparsification" of neural networks. Dr. Yan has published more than 40 papers in top international AI journals in related fields.
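To make "weight sparsification + activation sparsification" concrete, the sketch below counts how many multiplications remain when zeros are skipped on the weight side only, and then on both sides. It is a minimal illustration under stated assumptions, not the dual-sparse algorithm itself: count_macs, relu, and the sample values are hypothetical, and activation zeros are produced here by a ReLU purely for the sake of the example.

```python
def relu(values):
    """ReLU sets negative inputs to zero, a common source of zero activations."""
    return [max(0.0, v) for v in values]


def count_macs(weights, activations):
    """Count multiplications needed under three skipping policies.

    Returns (dense, weight_only, dual):
      dense       - every weight/activation pair is multiplied
      weight_only - pairs with a zero weight are skipped
      dual        - pairs with a zero weight or a zero activation are skipped
    """
    dense = weight_only = dual = 0
    for row in weights:
        for w, a in zip(row, activations):
            dense += 1
            if w != 0.0:
                weight_only += 1
                if a != 0.0:
                    dual += 1
    return dense, weight_only, dual


# Hypothetical 2x4 layer: 5 of 8 weights are zero.
weights = [
    [0.5, 0.0, -1.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
]
# After ReLU, two of the four activations are zero: [1.5, 0.0, 0.0, 2.0].
activations = relu([1.5, -0.3, 0.0, 2.0])

print(count_macs(weights, activations))  # (8, 3, 1)
```

Skipped pairs contribute exactly zero to the output, so the result of the layer is unchanged. The two kinds of sparsity compound: only pairs where both the weight and the activation are nonzero need to be computed.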

Another co-founder, Lu Yong, graduated from the Department of Electronic Engineering of Zhejiang University. He has worked at well-known semiconductor companies such as SK Hynix and Marvell and led the development of multiple SSD controller chips that entered mass production globally.

The three have complementary strengths: one excels at product architecture, one at algorithm innovation and system optimization, and one at hardware engineering and product implementation. All three believe that sparsification is the future of AI computing, and they have gradually turned sparsification theory from the academic frontier into a commercially viable computing power solution.

MoXin's original dual-sparse algorithm uses software pre-optimization to eliminate invalid and non-core computing elements in the model, transforming computing tasks into efficient and accurate sparse computing tasks.

"A large number of parameters in AI models are zero-valued and do not participate in the calculation. The core of sparsification technology is to enable AI models to achieve true on-demand computing," Wang Lvyu explained.

The idea is not new to the industry, but MoXin is the first to turn the dual-sparse route of "weight sparsification + activation sparsification" into a mass-produced product with a software-hardware co-designed solution, and the first to commercialize it. So far, MoXin has filed more than 100 related patent applications worldwide.

More convincing verification comes from MLPerf™, the authoritative international AI benchmark. MoXin's S30 computing card has topped the MLPerf™ inference list three times in a row, so its technical strength has also been verified by an authoritative international body.

The pace at which capital has come in is the most honest vote.

As early as the tape-out of MoXin's first-generation chips, well-known financial institutions such as Shenzhen Angel Mother Fund, Triumph Venture Capital, Jiangmen Investment, ZhenFund, and Jushi Capital invested one after another.

But the real turning point came in 2024. As large models shifted from technological competition to commercial deployment, capital began betting heavily on computing power enterprises with the ability to deliver.

Behind the accelerated financing lies breakthrough product progress. With its sparsification-optimized AI inference solution, MoXin has verified its commercial value in multiple real-world scenarios. In actual tests, AI inference services running on MoXin's computing cards not only significantly reduced overall inference cost but also increased inference speed several times over.

According to IDC's forecast, the share of inference workloads will reach 73% in 2028. As the Chinese AI computing power market shifts from "training-oriented" to "inference-oriented", the ability to cut costs and raise efficiency to the extreme has become MoXin's most solid competitive barrier in the commercial deployment stage.

"Moat" and "Acceleration"

Is it possible for other competitors to quickly copy or bypass MoXin's sparse computing technology route?

Wang Lvyu believes that MoXin has three layers of core barriers:

The first layer is the patent barrier. MoXin has built a global PCT patent portfolio since its founding in Silicon Valley, covering hardware, algorithms, and software.

The second layer is engineering accumulation. Although sparsification theory is public, turning the algorithm into a mass-produced chip with software-hardware co-design takes years of systematic investment. Since its first-generation chips taped out in 2021, MoXin has spent three to four years completing scenario adaptation with three types of benchmark customers: Internet companies, vertical industry players, and intelligent computing centers.

The third layer is the first-mover advantage in the ecosystem. Sparse computing is not a single-chip technology but an entire collaborative system covering chips, compilers, toolchains, and customer models. MoXin has worked in the industry for many years and has formed in-depth cooperation with many large customers. This barrier of time cost and trust cannot be broken overnight.

It is this barrier that gives MoXin unique confidence in its commercial implementation.

So far, MoXin has strategically expanded its intelligent computing center clusters in the four major regions of Northwest, Southwest, East, and North China. The thousand-card-level inference cluster deployed in the Northwest region has implemented multiple factory security projects in scenarios such as electronic manufacturing and consumer goods production, achieving real-time AI analysis at the edge. The Southwest region combines local green power resources to build a low-power green computing power pool. The East region targets high-end service industries such as bioinformatics analysis and healthcare and cooperates with industry leaders to accelerate the gene sequencing data analysis process. The North region empowers urban governance and community intelligent upgrading.

To survive the cycle, relying on a single market is not enough. When asked what kind of company MoXin wants to be, Wang Lvyu relayed the vision of founder Wang Wei: to make MoXin the leader in sparse computing, rely on technological innovation to reduce AI inference costs, and empower the popularization of AI for all with sparse computing.

It is reported that the proceeds will mainly go toward mass production and commercialization of the new-generation computing card SparsePrime®, as well as further expansion of the national computing power network.

"Inference cost is the key bottleneck for the popularization of AI, and sparse computing is providing a fundamental solution. From an investment perspective, when evaluating the value of an AI chip company, one should look not only at the theoretical computing power of a single card but also at its effective computing power and energy-efficiency ratio when completing the same AI tasks in a real-world cluster environment. MoXin's multi-location deployment and continuously growing customer base are solid proof of its product strength and commercial value," said Wang Lvyu.

This article is from the WeChat official account "Touzhongwang", author: Li Man, editor: Wang Qingwu, published by 36Kr with authorization.