Three Tsinghua alumni have raised another 600 million yuan in financing.

Pencil News · 2026-03-04 09:24
Another dark horse has emerged in the AI video generation track.

Shenshu Technology, founded less than three years ago, recently completed a Series A+ round of over 600 million yuan, the largest single financing to date in China's video generation field.

We often complain that video creation is too laborious: the learning curve is steep, the work is time-consuming, and the cost is high. What is really missing is efficient, intelligent creation tools. Shenshu Technology equips video creators with an AI brain and hands, so that ideas can be turned into videos automatically.

Shenshu Technology has so far served customers including ByteDance, Sony, and Anta.

Since the second half of last year, the AI video generation track has been riding a wave of financing. Will it become the first AI sub-track to prove out its business model?

01

Pain points of large video models

Shenshu Technology was founded in Beijing in March 2023. The three co-founders, Zhu Jun, Tang Jiayu, and Bao Fan, all have Tsinghua University backgrounds.

Zhu Jun was born in 1983 in Funan, Anhui. He graduated from the Department of Computer Science at Tsinghua University and later served as deputy dean of Tsinghua's Institute of Artificial Intelligence. He has long worked on image generation research, with a focus on a core next-generation technology: the diffusion model.

Tang Jiayu completed both his undergraduate and master's degrees in the Department of Computer Science at Tsinghua University, where he studied in the Natural Language Processing Laboratory. After his master's degree, he joined Tencent YouTu Laboratory as a senior product manager and later served as vice president of RealAI, an AI infrastructure service provider.

Bao Fan was born after 1995. He was admitted to the School of Life Sciences at Tsinghua University as an undergraduate in 2014 and transferred to the Department of Computer Science and Technology two years later. He received his bachelor's degree from that department in 2019 and later earned a doctorate at Tsinghua, where he was a student of Zhu Jun.

In 2022, the industry's focus was still on large text-to-image models. Zhu Jun judged that the next breakthrough would come in video generation, and he decided to develop a large video model for practical applications.

As these resumes show, this team of Tsinghua teachers and students has built up long-term experience in video generation algorithms and underlying R&D.

As early as September 2022, the team proposed the U-ViT architecture. It can be used to generate high-quality images and videos, to generate 3D content, and even to build world models.

In 2023, Shenshu Technology was founded, incubated by RealAI, and quickly received nearly 100 million yuan in angel-round financing from Ant Group and Baidu Ventures.

In 2024, building on U-ViT, Shenshu Technology launched Vidu, a large model built specifically for video generation. It can simulate the real physical world, and the frames it generates are rich in detail.

More importantly, the scene movement conforms to physical laws and looks more real.

In Shenshu Technology's early days, progress on productization and on building a sales system was slow. The laboratory capabilities were strong, but they had not been converted into stable income: the model ran, but it did not bring in recurring payments.

The problem was not the model itself. The key questions were how to turn the model into a product and how to make customers willing to pay for it over the long term.

The turning point came in 2025.

Luo Yihang joined the company as CEO. He previously worked at ByteDance's Volcengine, where he was responsible for AI solutions. He has long worked in B2B business, is familiar with enterprise procurement processes, and understands customers' actual application scenarios.

After taking office, he first streamlined the product structure and built a four-layer system around Vidu: MaaS, SaaS, Agent, and APP. Each layer targets a different type of customer, with clear solutions for both individual creators and enterprise platforms.

Technology and business advanced in parallel.

On the technology side, Zhu Jun continued to lead the team in optimizing the model's performance and efficiency. Public data shows that Vidu ranks second globally and first in China in large video model evaluations. Generation speed has also improved significantly and is about 10 times faster than Sora 2.

On the business side, Luo Yihang accelerated customer partnerships. In film and content, customers include ByteDance, Sony, and CCTV Animation. In brand marketing, the company works with L'Oréal and Anta. It is also working with Lenovo and AMD to explore deployment on AI PCs.

In 2025, Shenshu Technology's user count and revenue both grew more than tenfold year-on-year. It served more than 3,000 enterprises, its business covered more than 200 countries and regions, and its annual recurring revenue exceeded 20 million US dollars.

Compared with the early days, the focus of Shenshu Technology has changed. The company builds a product system around video generation capabilities and is no longer just doing model research.

02

Industry turning point

In less than three years, Shenshu Technology has completed six rounds of financing and one equity transfer. Its backers include Huawei, Ant, Baidu, and Zhipu.

Beyond Shenshu Technology, the entire AI video generation track has attracted heavy investment from industrial capital. Since last year, most of the Chinese AI startups with the fastest financing pace and the fastest-rising valuations have been concentrated in the video track.

From a technical perspective, the industry's turning point came in the second half of 2024, when multimodal model architectures began to shift toward DiT/Transformer-based designs.

Once the architecture changed, so did the results. "Long-sequence consistency" in video generation improved significantly.

Simply put, frames no longer conflict with one another, characters' faces do not suddenly change, and motion is more coherent. This means that a key problem, stability, has been solved.

AI video generation has truly approached "commercial availability" for the first time.

That is to say, it is no longer just a demonstration effect. It can enter the real production process, be used as a tool, and be included in the cost and efficiency calculations.

Some startups have also started to really make money from it.

AI video generation products can shorten the video production cycle by about 80% and reduce costs by about 90%. They can also cover scenarios such as self-media, advertising, film and television, and e-commerce, meeting the need for large-scale content production.

Data from the Economic Observer shows that in December 2025, several leading AI video companies disclosed their performance. Their revenue scales are completely different from those of a year ago.

A year earlier, their revenue was small enough to be almost negligible. Now they have entered the "100-million-yuan club", with the lowest at about 140 million yuan and the highest approaching 1 billion yuan.

The growth rate is very obvious.

Before Shenshu Technology's latest round, the record for a single financing round in the AI video generation track belonged to Aishi Technology. In September last year, Aishi Technology announced the completion of a Series B round of over 60 million US dollars. Just one month later, it announced the completion of a Series B+ round of 100 million yuan.

One important reason for the investment in Aishi Technology is that its commercialization has taken shape. Its global consumer (C-end) user base is said to have exceeded 100 million, and 80% of its revenue comes from consumers.

On the business (B-end) side, it provides APIs and customized video generation services for fields such as advertising, short dramas, and games. Its annual recurring revenue (ARR) exceeded 40 million US dollars in 2025.

In December last year, the AI video generation product Pollo AI completed a first-round financing of 14 million US dollars, led by Gaocheng Capital, with ZhenFund participating.

Pollo AI has integrated most of the mainstream video generation models on the market, including OpenAI's Sora, Midjourney, Vidu, Hailuo, and Kling.

It integrates different models on one platform and clearly marks the advantages of each model.

For example, some are good at realistic styles, some are more suitable for anime effects, and some are stronger in character stability.

Users can pick the appropriate model for their needs directly, without repeatedly testing each one themselves, which makes the process more efficient and targeted.

Notably, Pollo AI's commercialization has progressed quickly: it has more than 20 million registered users, more than 6 million monthly active users, more than 200,000 daily active users, and annualized revenue of more than 20 million US dollars.

Several other companies in the industry are also worth watching.

One is Vivix AI, an AI video generation enterprise founded by Liu Yu, the research director of SenseTime. According to Z Finance, it completed a seed-round financing in February, jointly led by Sequoia China and IDG Capital. In November last year, its Series A valuation exceeded 1.32 billion US dollars.

Another is LiblibAI, which completed a Series B round of 130 million US dollars in October last year. Although it is mainly image-based rather than a pure video generation company, its commercialization has also progressed smoothly, with an ARR of over 1.5 million US dollars within half a year.

03

Trend: Intensified head effect

What exactly is driving the financing wave for multimodal products, with video generation at the forefront?

Liang Wei, the founder of MovieFlow, told Pencil News that the head effect in this financing wave, meaning the concentration of capital in leading players, is obvious. The key to obtaining financing is whether a company's application scenarios are mature and whether they can support commercial implementation.

Compared with the previous two years, the biggest change in the industry is that integrated audio-visual models have transformed these tools from "fragment generation" to "complete content creation".

At this stage, the core value of an enterprise has shifted from pure technology to packaging the model into an easy-to-use product that users can rely on stably. Providing standardized, scenario-based delivery services is the core business opportunity.

MovieFlow currently offers both lightweight consumer (C-end) products and MovieFlow Studio, a professional edition for film and television production. It both builds a YouTube-style content community and offers Netflix-style high-quality content.

Liang Wei predicts that the industry landscape will be largely settled in the first half of 2026. The room for new entrants will shrink significantly, and resources will concentrate in leading enterprises with strong technology and profitability.

Deng Kun, head of the strategic investment department at Giant Network and an investor in Aishi Technology and LiblibAI, told the Economic Observer that in 2026, AI video will see better commercial returns in fields such as e-commerce, education, short dramas, and comic dramas. The next decisive point is whether an AI video product with movie-level picture quality can be developed.

This article does not constitute any investment advice.

This article is from the WeChat official account "Pencil News" (ID: pencilnews), written by Song Ge and edited by Huang Xiaogui. It is published by 36Kr with authorization.