
With Revenue Over 100 Million, This Multimodal Generative AI Dark Horse Starts a New Phase

晓曦 2026-01-20 16:33
At a time when the rollout of multimodal applications is accelerating, what business potential does Zhixiang Future still hold?

Almost two years have passed since OpenAI released its text-to-video model Sora, yet AIGC companies in China and the US are at completely different stages of development. On one side is Sora 2, which remains costly and has never really taken off, along with the Sora app and its near-zero retention rate. On the other are Chinese companies that, rooted in fertile application ground, are on a sustained upswing and breaking through in commercialization across the board.

The magazine "Intelligent Emergence" recently learned that Zhixiang Future, a young AI company specializing in visual multimodality, achieved total revenue of over 100 million yuan in 2025. Its consumer-facing product vivago.ai also recently hit a peak in downloads: in January 2026, it added nearly 10 million new users and reached the top 10 of the "Video Players & Editors" category on the Google Play Store in over 100 countries and regions, which points to substantial commercial potential.

Since its founding, Zhixiang Future has released the image generation model HiDream-I1 and the interactive editing model HiDream-E1, and fully open-sourced both in April 2025. Within 24 hours of its release, HiDream-I1 topped the internationally recognized AI benchmark leaderboard Artificial Analysis.

This Hefei-based company has struck a balance between generation quality and efficiency through its self-developed large model with over 10 billion parameters and the world's first diffusion autoregressive architecture. Its products are now widely used in areas such as the cultural and creative industries, film and television, and advertising.

"Intelligent Emergence" has learned exclusively that Zhixiang Future's fundraising is picking up pace again: the Series B round is already in its closing phase, and the term sheet for the following round is already in hand. Insiders close to the company said that both rounds are in the hundreds of millions of yuan.

Amid intensifying competition in AI visual generation, Zhixiang Future continues to attract substantial investment from top-tier investors thanks to its technical strength and clear path to commercialization.

As the deployment of multimodal AI accelerates, what commercial potential lies behind Zhixiang Future?

The Most Industry-Savvy Scientists, the Most Practical Romance

From the very beginning, Zhixiang Future found a practical form of romance. Its founder, Mei Tao, is an overseas member of the Canadian Academy of Engineering and previously worked at Microsoft for 12 years. He has published over 300 papers in multimedia analysis and computer vision and has won international best paper awards 15 times.

But Mei Tao's experience goes beyond pure research. In 2018, he joined JD.com as deputy director of the JD Exploration Research Institute, a period that taught him the path from technology to commercial deployment.

When he decided to found Zhixiang Future, Mei Tao had a clear vision. On one hand, multimodality is the most likely path to artificial general intelligence (AGI), a view that later became the industry consensus. On the other, multimodality offers a much broader commercial space than pure language models. "Currently, 50% to 60% of global AIGC revenue comes from image and video applications, which is higher than the share from pure text models. In 2023, when we decided to start the company, multimodal companies like Midjourney had already proven their commercial strength with SaaS tools and clearly confirmed the marketability of their products," Mei Tao told 36Kr in mid-2025.

And this is exactly Mei Tao's home turf, given his deep expertise in computer vision (CV) and multimodality.

At that time, however, Sora loomed like a mountain in the path of Chinese start-ups. Given its ability to reproduce the physical world and its impressive results, the industry was curious to see whether Chinese start-ups could produce results that rivaled Sora's.

So a race began. Only six months after the release of Sora, Zhixiang Future released its self-developed multimodal model. In April 2025, it went further and open-sourced the image generation model HiDream-I1 and the interactive editing model HiDream-E1, closing the loop from conversation to image creation. HiDream-I1 topped the Artificial Analysis leaderboard within 24 hours, becoming the first Chinese self-developed generative AI model to enter the global top tier. It also set records across three dimensions: image quality, semantic understanding, and artistic expression.

Later, several entrepreneurs concluded that Sora lagged in architectural innovation. Mei Tao, too, felt at the time that Sora's overall capabilities were in line with his expectations. In the six months that followed, as start-ups like Zhixiang Future entered the market, OpenAI lost any significant advantage in video generation. From a product perspective in particular, there is now hardly any difference between foreign and Chinese products.

In addition, Zhixiang Future has moved to the forefront in exploring multimodal architecture paradigms. The company first pursued a two-model strategy, with separate models for generation and understanding, and then planned to unify understanding and generation, an approach regarded as the best path toward the physical world.

Zhixiang Future is still working on the industry's open problems. In 2025, after releasing its latest model and products such as vivago 2.0, Mei Tao told 36Kr that the DiT (Diffusion Transformer) architecture draws on the Transformer's strength in processing video data, enabling the model to capture spatio-temporal relationships efficiently and to generate videos at different resolutions. This is important progress. For generative AI as a whole, however, realistically reproducing complex physical phenomena remains an unsolved problem: the trajectories of splashing water droplets, the mechanical feedback of colliding objects, and other dynamic details that humans perceive instinctively are still at the stage of "similar in form but different in spirit", and visual inconsistencies often appear in such scenes.

Zhixiang Future has struck an excellent balance between generation quality and runtime with its Sparse DiT architecture. Using presence distillation technology, it has increased inference efficiency while significantly improving image detail and aesthetics. This ultimately underpinned the creative successes of its HiDream-I1 model.

Blazing New Trails in Algorithm Development to Solve the Last-Mile Problem

Unlike large companies, which focus on developing foundation models and scaling up parameters, small companies pay more attention to innovation and deployment. In Mei Tao's view, this is where Zhixiang Future's value lies: solving the problem of commercializing AI.

He told 36Kr: "From the first day of our establishment, we were very aware of the risks and always asked ourselves how we could find product-market fit (PMF). We entered commercialization early and quickly. Although we didn't raise the most money, we thought carefully about every yuan we spent and every employee we hired."

In its early days, Zhixiang Future developed a "1 + 3 + N" layout: one central multimodal model drives three main products, namely a creative tools platform, an interactive marketing content tool, and a one-stop video generation agent. To date, its services reach over 20 million individual users and over 40,000 corporate users worldwide.

After determining the positioning, it is important to improve the delivery and customer service so that the AI can actually create value.

Mei Tao told 36Kr that Zhixiang Future has China's most comprehensive licensed multimodal corpus, hundreds of thousands of hours of licensed video material, and thousands of licensed IPs. This covers not only 70% of Chinese film and television data but also hundreds of millions of AIGC post-processing assets, which are now widely used in scenarios such as film and television, culture and tourism, and marketing.

"At the Microsoft Research Institute, we always said that it might take a hundred engineers to turn a technology into a product; and to sell the product well, it might take another hundred solution experts or business development specialists. This shows how big the gap is. At that time, I thought I had to go somewhere to close the chain."

It is precisely this ability to cover the entire chain from technology to deployment that investors have valued highly since Zhixiang Future's founding.

In 2024, Zhixiang Future completed a Series A round in the hundreds of millions of yuan, led by the Hefei Industrial Investment Group with participation from institutions such as the Anhui Artificial Intelligence Mother Fund. At the end of 2025, JD Group increased its investment in Zhixiang Future as a strategic investor. JD Group's vast business scenarios in logistics, retail, healthcare, and industry offer an ideal testing ground and application field for multimodal AI technology.

Insiders subsequently said that Zhixiang Future has accelerated preparations for its Series B round and plans to close it in January 2026.

36Kr recently learned that Zhixiang Future has already received the term sheet for the next round. Existing shareholders have continued their support, and the new shareholders include industrial capital, listed companies capable of in-depth cooperation, and well-known investment institutions. The Series B round has so far reached hundreds of millions of yuan.

Yuan Guoliang, CEO of Shanghai Dunhong Asset Management, described Zhixiang Future as follows: "We are firmly convinced that video generation technology, as a new productivity tool, will revolutionize all industries. Especially in e-commerce, video has already become the central medium connecting goods with consumers. HiDream has already confirmed its application and commercial value in the e-commerce scenario through its products, which shows that the team understands not only the technology but also the industry. At the same time, we believe that its technological architecture and development path have the potential to evolve into a more universal and in-depth world model, which would amount to a leap in basic capabilities. We look forward to exploring the long-term path of technology-industry integration with the team and promoting multimodal generation as a universal and intelligent industry infrastructure."

The Best Example of the Combination of Commercial Strength and Architectural Innovation

The year 2025 was the breakout year for Chinese multimodal generative AI. As AIGC technology has matured, productivity and creativity have improved significantly, leading to explosive growth in the application market. According to IDC data, the global generative AI market will grow at an average annual rate of 63.8% over the next five years and reach $284.2 billion by 2028, accounting for 35% of total AI investment. As one of the winners, Zhixiang Future has benefited from its technical strength and its approach to industrial deployment. The company's commercialization has progressed rapidly, and 36Kr has learned that Zhixiang Future's annual revenue in 2025 exceeded 100 million yuan.
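To make the IDC growth figure concrete, here is a minimal sketch of the compound-growth arithmetic behind it. It assumes the 63.8% rate compounds annually over exactly five periods ending in 2028. The article states neither the base year nor the base value, so the starting figure the code derives is only a back-calculation under that assumption, not IDC data.

```python
from __future__ import annotations


def implied_base(final_value: float, annual_rate: float, years: int) -> float:
    """Back out the starting value implied by a compound annual growth rate."""
    return final_value / (1.0 + annual_rate) ** years


def project(base: float, annual_rate: float, years: int) -> list[float]:
    """Return the value at the end of each year under constant compound growth."""
    return [base * (1.0 + annual_rate) ** n for n in range(1, years + 1)]


# Figures quoted in the article (IDC): 63.8% average annual growth, $284.2B by 2028.
TARGET_2028_BILLION = 284.2
CAGR = 0.638
YEARS = 5  # assumption: five full compounding periods ending in 2028

base = implied_base(TARGET_2028_BILLION, CAGR, YEARS)
path = project(base, CAGR, YEARS)

if __name__ == "__main__":
    print(f"implied starting market size: ${base:.1f}B")
    for n, value in enumerate(path, start=1):
        print(f"year {n}: ${value:.1f}B")
```

Under these assumptions the market multiplies roughly twelvefold in five years, which is why the implied starting size is so much smaller than the 2028 target.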

Achieving such results so quickly in the highly competitive field of multimodal generation comes down to Zhixiang Future's distinctive thinking on business models and its strong fundamental innovation. Zhixiang Future is arguably one of the few companies in the industry that excels at both commercialization and technological innovation.

In the three years since its founding, Zhixiang Future has gone through several business models. In 2023, it used MaaS (Model as a Service), selling models and APIs, similar to the PaaS model in the cloud computing industry. In 2024, it switched to SaaS (Software as a Service), mainly selling tools so that users could produce content on Zhixiang Future's platform.

Today, it has upgraded its model again and officially moved to RaaS (Result as a Service), a business model oriented toward user value: tools, content assets, and a limited amount of video production and placement are offered for only a low base fee, and the company earns mainly from a commission on the increase in customers' GMV. According to Mei Tao, the benefits for customers are relatively clear: deployment is essentially risk-free, and both sides share in the additional profit.
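The RaaS structure described above, a low base fee plus a commission on the customer's GMV increase, can be sketched as a simple calculation. The class name, fee level, commission rate, and GMV figures below are hypothetical placeholders chosen for illustration. The article does not disclose Zhixiang Future's actual pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RaaSContract:
    """Hypothetical result-as-a-service terms; all numbers are illustrative."""

    base_fee: float         # fixed fee charged regardless of outcome
    commission_rate: float  # share of incremental GMV paid to the vendor

    def vendor_revenue(self, baseline_gmv: float, actual_gmv: float) -> float:
        """Base fee plus commission on GMV growth; no commission if GMV did not grow."""
        uplift = max(0.0, actual_gmv - baseline_gmv)
        return self.base_fee + self.commission_rate * uplift


# Hypothetical terms: 1,000 base fee and 5% of any GMV increase.
contract = RaaSContract(base_fee=1_000.0, commission_rate=0.05)

# Customer GMV grows from 200,000 to 260,000: vendor earns base fee plus commission.
with_growth = contract.vendor_revenue(baseline_gmv=200_000.0, actual_gmv=260_000.0)

# Customer GMV falls: vendor earns only the base fee.
no_growth = contract.vendor_revenue(baseline_gmv=200_000.0, actual_gmv=190_000.0)

if __name__ == "__main__":
    print(f"vendor revenue with GMV growth: {with_growth:.2f}")
    print(f"vendor revenue without GMV growth: {no_growth:.2f}")
```

Under these assumed terms the customer's downside is capped at the base fee, which is consistent with the "essentially risk-free" framing, while the vendor's income depends mostly on the results it delivers.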

As the start-up has grown more successful, Mei Tao said he has also found a balance between commercial accumulation and capability improvement. On one hand, he keeps increasing investment and engages intensively in research on vertical foundation models. A stronger and more advanced foundation will surely provide a better basis for model capabilities.