A startup affiliated with Tsinghua University secures the largest single financing in the domestic video generation field.
Zhidx reported on February 5 that Beijing-based multi-modal generation startup Shuanshu Technology has announced the completion of a Series A+ financing round exceeding 600 million RMB. Shuanshu Technology also revealed that in 2025 its users and revenue both grew more than 10-fold, with its users and business now covering more than 200 countries and regions worldwide.
The round surpasses the previous record for a single financing in the domestic video generation field, the 430 million RMB raised by Aishi Technology.
This financing round was led by Zhongguancun Science City Company and Xinglian Capital. Listed companies such as Wondershare Technology, Visual China, and Thorsen Technology made strategic investments. Existing shareholders including Qiming Venture Partners, Beijing Artificial Intelligence Industry Investment Fund, Zhuoyuan Asia, C&D Emerging Investment, and Huaihai Investment increased their investments.
Shuanshu Technology was founded in Beijing in March 2023. Its co-founders Zhu Jun, Tang Jiayu, and Bao Fan all have Tsinghua University backgrounds. Zhu Jun is the Deputy Dean of the Institute for Artificial Intelligence at Tsinghua University. In March 2025, Luo Yihang, the former head of AI Solutions at ByteDance's Volcengine, joined Shuanshu Technology as CEO, taking full responsibility for R&D, products, commercialization, and team management.
▲ The person on the far left in the picture is Luo Yihang, the CEO of Shuanshu Technology (Photo source: Shuanshu Technology)
01
Currently, Shuanshu Technology mainly focuses on the R&D of multi-modal general large models and their applications. Through product forms such as SaaS, MaaS, and apps, it provides video generation and multi-modal generation products for individual users, professional creators, and enterprise customers.
Shuanshu Technology is one of the earliest teams in the world to research multi-modal generation algorithms. In September 2022, the company's founding team proposed the U-ViT architecture, three months earlier than OpenAI's DiT architecture. In 2024, Shuanshu Technology launched the text-to-video large model Vidu, first in China and then overseas.
Commenting on the financing round, Zhu Jun said that multi-modal video models can be applied not only to digital content creation and interaction but also to building world models that understand the laws of the real world and support end-to-end machine decision-making. Shuanshu Technology will explore and seek breakthroughs in the value of AI in the physical world.
This suggests that Shuanshu Technology may eventually apply its multi-modal video model to physical AI scenarios such as robotics and embodied intelligence.
In addition to text-to-video and image-to-video capabilities, Vidu also pioneered "reference-based video generation", which lets users upload materials such as images as reference elements so that the AI generates videos containing those elements. The technology uses a dedicated consistency maintenance algorithm to keep multiple subjects consistent across a video, a requirement of commercial-grade video production.
From 2024 to 2026, the Vidu model series went through three iterations, improving on metrics such as multi-subject consistency, semantic understanding, dynamics, stability, expressiveness, and generation speed.
On January 30 this year, Shuanshu Technology released Vidu Q3, a video generation model aimed mainly at professional-level film and television production. In the latest leaderboard published by the AI benchmarking organization Artificial Analysis, Vidu Q3 ranked first in China and second globally, behind only xAI's Grok video generation model and ahead of Runway Gen-4.5, Google Veo 3.1, and OpenAI Sora 2.
In terms of features, Vidu Q3 supports synchronized audio and video output for clips of up to 16 seconds, 1080p picture quality, rich cinematic shot language, precise shot transitions, text rendering in multiple languages, and multi-language output.
02
Alongside the financing announcement, Shuanshu Technology also disclosed its customer composition.
In the film and television industry (comic dramas, short dramas, and movies), Vidu covers more than 90% of content providers, tool providers, and production institutions. Its customers and partners include Sony Pictures, Tencent Animation, iQiyi, and Mango TV.
In the Internet and smart hardware industries, Vidu's customers include ByteDance, Samsung, Wondershare Technology, TAL Education Group, Alipay, and Honor, which mainly use it for content production and product interaction innovation.
Shuanshu Technology also has multiple customers in industries such as advertising and gaming, and its products have made some headway in overseas markets.
According to Qichacha, including the newly announced round, Shuanshu Technology has completed a total of 6 financing rounds and 1 equity transfer. Companies such as Huawei, Ant Group, Baidu, and Zhipu have all invested in Shuanshu Technology.
Baidu led Shuanshu Technology's Pre-A round, worth hundreds of millions of RMB, and participated in its Series A round together with companies such as Huawei and Ant Group.
Zhipu participated in Shuanshu Technology's Angel+ round, worth hundreds of millions of RMB. In April 2025, Zhipu also entered into a strategic cooperation with Shuanshu Technology. Zhipu's MaaS platform has since integrated Shuanshu Technology's Vidu API and offers it to external customers.
At its angel round in June 2023, Shuanshu Technology was valued at 100 million US dollars (approximately 690 million RMB). It has not disclosed its valuation in any subsequent round.
▲ Financing history of Shuanshu Technology (Photo source: Qichacha)
03
Conclusion: Leading startups in the video generation track continue to attract capital
Over the past year, the video generation track has remained hot, and startups have continued to attract large amounts of capital. In September last year, Aishi Technology completed a Series B financing round of 60 million US dollars led by Alibaba. During the same period, Video Rebirth, a video-native world model company founded by a former distinguished scientist at Tencent, also completed a financing round of 50 million US dollars, showing the market's strong confidence in technological innovators.
However, video generation is also a key battleground for large companies. Take Kuaishou's Kling AI as an example. Its total sales in 2025 are expected to reach 140 million US dollars, and its Annual Recurring Revenue (ARR) has exceeded 240 million US dollars, demonstrating strong commercialization potential. Overseas, Google's Veo series of models quickly gained market share on the strength of its generation quality and reputation among users, and continues to expand its influence through Google's ecosystem.
Facing strong competitors, startups in the video generation field still need to answer a key question: how can they build differentiated advantages in technology, scenarios, or ecosystems?
This article is from the WeChat official account "Zhidx" (ID: zhidxcom), author: Chen Junda. Republished by 36Kr with authorization.