Revenue Tops 100 Million Yuan: This Rising Star of Multimodal Generative AI Enters a New Phase

晓曦 2026-01-20 16:33
As multimodal applications are deployed at an accelerating pace, how much more commercial potential does Zhixiang Future hold?

Nearly two years have passed since OpenAI released its text-to-video model Sora, yet AIGC companies in China and the United States are in completely different states of development. On one side is Sora 2, whose costs remain high and which has never been commercialized at scale, along with the Sora app, whose retention rate is close to zero. On the other side are Chinese companies that build on fertile application scenarios, keep improving, and are now seeing a broad commercial breakout.

The magazine "Intelligent Emergence" recently learned that Zhixiang Future, a generative AI startup specializing in visual multimodality, achieved annual revenue of over 100 million yuan in 2025. Its consumer product vivago.ai also recently hit a peak in downloads: nearly 10 million new users were added in January, and it ranked among the top 10 in the "Video Playback and Editing" category of the Google Play Store in over 100 countries and regions. These figures point to considerable room for commercial growth.

Since its founding, Zhixiang Future has successively released the large image-generation model HiDream-I1 and the interactive editing model HiDream-E1, and fully open-sourced both in April 2025. Within 24 hours of the open-source release, HiDream-I1 topped Artificial Analysis, an authoritative international AI evaluation leaderboard.

This Hefei-based company has struck a balance between generation quality and efficiency through its self-developed large models with over a hundred billion parameters and a diffusion-autoregressive architecture that it describes as a world first. Its products are now widely used in areas such as culture and creative work, film and television, and advertising.

"Intelligent Emergence" has exclusively learned that Zhixiang Future's fundraising is accelerating again: the Series B round is in its final stage, and the term sheet for the following round is already in hand. People close to the company said that both rounds are in the hundreds of millions of yuan. Despite intensifying competition in AI visual generation, Zhixiang Future continues to attract substantial backing from top investors thanks to its technological strength and clear commercial path.

How much more commercial potential does Zhixiang Future hold, given the accelerating deployment of multimodal applications?

The Most Industry-Minded Scientist, the Most Practical Romance

Since its founding, Zhixiang Future has pursued a practical kind of romance. Founder Mei Tao is a foreign academician of the Canadian Academy of Engineering and previously worked at Microsoft for 12 years. He has published over 300 papers in multimedia analysis and computer vision and has won international best paper awards 15 times.

But Mei Tao's experience is not limited to academia. In 2018, he joined JD.com as deputy director of the JD Research Institute. This career path helped him understand the route from technology to commercial deployment.

When he decided to found Zhixiang Future, Mei Tao had a clear rationale. First, multimodality is the most likely path to artificial general intelligence (AGI), a view that later became industry consensus. Second, compared with pure language models, multimodality offers a much larger commercial space. "Currently, 50% to 60% of global AIGC revenue comes from image- and video-related applications, more than from pure text models. When we decided to found the company in 2023, multimodal companies like Midjourney had already proven their strong commercial ability with SaaS tools and clearly confirmed the marketability of their products," Mei Tao told 36Kr in mid-2025.

And this is exactly Mei Tao's home turf, as he has in-depth expertise in computer vision (CV) and multimodality.

For innovative Chinese companies at that time, however, Sora loomed like a mountain. Given its fidelity to the physical world and the impressive results it achieved, the industry was keenly interested in whether Chinese startups could match it.

So a race began. Only six months after Sora's release, Zhixiang Future released its self-developed multimodal large model. In April 2025, Zhixiang Future went further and open-sourced the large image-generation model HiDream-I1 and the interactive editing model HiDream-E1, closing the loop from conversation to image creation. HiDream-I1 topped the authoritative Artificial Analysis leaderboard within 24 hours and became the first Chinese self-developed generative AI model to enter the global top tier. It set new industry records in three dimensions: image quality, semantic understanding, and artistic expression.

Subsequently, several entrepreneurs concluded that Sora was not especially advanced in terms of architectural innovation. Mei Tao also felt at the time that Sora's overall capabilities were in line with his expectations. In the following six months, as startups like Zhixiang Future entered the market, OpenAI lost much of its lead in video generation. From the perspective of product implementation in particular, there is hardly any difference between foreign and Chinese products.

At the same time, Zhixiang Future is among the leaders in exploring multimodal architectural paradigms. The company first developed a dual model for generation and understanding and then planned the unification of the two. This is regarded as the best route to the physical world.

Zhixiang Future also continues to tackle industry problems. In 2025, around the open-source release of its latest model and the launch of products like vivago 2.0, Mei Tao told 36Kr that the DiT (Diffusion Transformer) architecture uses the strong capabilities of the Transformer to process video data. This lets the model efficiently capture temporal and spatial relationships and flexibly generate videos at different resolutions, which is important progress. However, for the entire generative AI industry, the realistic reproduction of complex physical phenomena remains an unsolved problem: the trajectories of splashing water droplets, the mechanical feedback in object collisions, and other dynamic details that are obvious to human intuition are still at a research stage where the results look plausible but not truly real. In such scenarios, visual inconsistencies still appear.
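To make the point about resolution flexibility concrete, here is a minimal sketch of how a Transformer-based video model can treat a clip as a sequence of spacetime patches. It is an illustration only, not Zhixiang Future's implementation: the function names and the patch sizes (2 frames by 16 by 16 pixels) are assumptions, and real DiT-style systems typically patchify a compressed latent rather than raw pixels.

```python
def spacetime_grid(frames, height, width, patch_t=2, patch_h=16, patch_w=16):
    """Return the (time, rows, cols) patch grid for a clip.

    Patch sizes are hypothetical defaults chosen for illustration.
    """
    for name, size, patch in (
        ("frames", frames, patch_t),
        ("height", height, patch_h),
        ("width", width, patch_w),
    ):
        if size <= 0 or size % patch != 0:
            raise ValueError(f"{name}={size} must be a positive multiple of {patch}")
    return frames // patch_t, height // patch_h, width // patch_w


def spacetime_token_count(frames, height, width, **patch_sizes):
    """Number of tokens the Transformer would attend over for this clip."""
    grid_t, grid_h, grid_w = spacetime_grid(frames, height, width, **patch_sizes)
    return grid_t * grid_h * grid_w


def token_position(index, frames, height, width, **patch_sizes):
    """Map a flat token index back to its (time, row, col) patch coordinates."""
    grid_t, grid_h, grid_w = spacetime_grid(frames, height, width, **patch_sizes)
    if not 0 <= index < grid_t * grid_h * grid_w:
        raise IndexError("token index out of range")
    t, rest = divmod(index, grid_h * grid_w)
    row, col = divmod(rest, grid_w)
    return t, row, col


if __name__ == "__main__":
    # The same code handles different resolutions; only the sequence length changes.
    print(spacetime_token_count(16, 256, 256))  # 2048
    print(spacetime_token_count(16, 512, 256))  # 4096
```

Because each token keeps both a temporal and a spatial coordinate, attention over the sequence can relate any patch to any other across space and time, and a different resolution simply yields a longer or shorter sequence.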

Zhixiang Future has struck a balance between generation quality and runtime with its Sparse DiT architecture. Adversarial distillation increased inference efficiency while greatly improving the detail and aesthetics of the images. This ultimately led to several breakthroughs for Zhixiang Future's HiDream-I1 model.

Explore New Ways in Algorithm Development and Solve the "Last Mile" Problem

In contrast to large companies that compete on foundation models and parameter counts, small companies place more weight on innovation and implementation. In Mei Tao's view, this is where Zhixiang Future's value lies: solving the last-mile problem in AI deployment.

He told 36Kr: "From the first day of our company's founding, we had a strong sense of crisis and thought about how to find product-market fit (PMF). We took the commercial path early and quickly. Even though we didn't raise the most funding, we carefully considered every yuan invested and every employee hired."

Early on, Zhixiang Future developed a "1+3+N" strategy: one core multimodal large model drives three products, namely a platform for creative tools, tools for interactive marketing content, and a one-stop agent for video creation. So far, its services cover over 20 million individual users and over 40,000 enterprise users worldwide.

After determining the positioning, the key point is how to deliver products and services well and serve customers well so that AI can actually create value.

Mei Tao told 36Kr that Zhixiang Future has the most comprehensive licensed multimodal corpus in China, hundreds of thousands of hours of licensed video material, and thousands of licensed IPs. It not only covers 70% of Chinese film and television data but has also created hundreds of millions of AIGC derivative works, which are now widely used in scenarios such as film and television, culture and tourism, and marketing.

"At the Microsoft Research Institute, we often said that it might take a hundred engineers to transform a technology into a product, and another hundred solution experts or business developers to market the product well. You can see how big the gap is. At that time, I thought I had to find a place where I could close this chain."

It is exactly this ability to bridge the entire chain from technology to implementation that has made Zhixiang Future well regarded by investors since its founding.

In 2024, Zhixiang Future completed a Series A round in the hundreds of millions of yuan, led by the Hefei Industrial Investment Group with participation from institutions such as the Anhui Artificial Intelligence Mother Fund. At the end of 2025, JD Group invested in Zhixiang Future as a strategic investor. JD Group's vast business scenarios in logistics, retail, healthcare, and industry are an ideal testing ground and application area for multimodal AI technology.

Then, informed sources said that Zhixiang Future was intensively preparing for the Series B financing and planned to complete it at the beginning of 2026.

36Kr recently learned that Zhixiang Future has already received the term sheet for the next round. Existing shareholders continue to increase their stakes, and the new shareholders include industrial capital, listed companies with which in-depth business cooperation is possible, and well-known investment institutions. The Series B round has already reached hundreds of millions of yuan.

Yuan Guoliang, CEO of Shanghai Dunhong Asset Management, said of Zhixiang Future: "We are firmly convinced that video generation technology, as a new productivity tool, will fully empower all industries. Especially in e-commerce, video has become the core medium between goods and consumers. HiDream has preliminarily confirmed its application value and commercial potential in the e-commerce scenario through its product. This shows that the team not only understands the technology but also knows the industry well. At the same time, we believe that its technological architecture and development direction have the potential to evolve into a more universal and in-depth world model. This is a leap at the most fundamental level of capability. We look forward to exploring the long-term path of technology-industry integration together with the team and promoting multimodal generation as ubiquitous, intelligent industry infrastructure."

The Best Investment Target with Commercial Strength and Architectural Innovation

2025 was the breakout year for Chinese multimodal generative AI. As AIGC technology has matured, productivity and creativity have improved significantly, leading to explosive growth in the application market. According to IDC data, the global generative AI market will grow at an average annual rate of 63.8% over the next five years and reach 284.2 billion US dollars by 2028, accounting for 35% of total AI investment. Zhixiang Future is one of the beneficiaries thanks to its technological strength and its focus on industrial implementation. The company's commercial progress is rapid: 36Kr learned that Zhixiang Future's annual revenue in 2025 exceeded 100 million yuan.

Achieving such results so quickly in the competitive field of multimodal generation is due to Zhixiang Future's distinctive thinking about business models and its strong capacity for underlying innovation. Zhixiang Future is one of the few companies in the industry that focuses on both commercialization and technological innovation.

In the three years since its establishment, Zhixiang Future has gone through different business models. In 2023, the model was MaaS (Model as a Service), where models and APIs were sold, similar to the PaaS model in cloud computing. In 2024, it switched to the SaaS model (Software as a Service), where mainly tools were sold so that users could create content on the Zhixiang Future platform.

Now it has moved to RaaS (Result as a Service), a business model focused on user value, in which tools, content materials, and limited video production and delivery carry only a low base fee. The main profit comes from a commission on the customer's incremental GMV. According to Mei Tao, the benefits for customers are relatively clear: the investment is essentially risk-free, and both sides share in the additional profit.
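The fee structure described above can be sketched as simple arithmetic. The article gives no actual rates or fees, so the function name and every figure below (base fee, baseline GMV, commission rate) are hypothetical and serve only to show the shape of the calculation.

```python
def raas_invoice(base_fee, baseline_gmv, actual_gmv, commission_rate):
    """Return (uplift, commission, total) for one billing period.

    All parameters are hypothetical illustrations, not Zhixiang Future's terms.
    """
    if not 0 <= commission_rate <= 1:
        raise ValueError("commission_rate must be between 0 and 1")
    # Commission applies only to GMV above the agreed baseline;
    # if GMV does not grow, the customer pays the base fee alone.
    uplift = max(actual_gmv - baseline_gmv, 0)
    commission = uplift * commission_rate
    return uplift, commission, base_fee + commission


if __name__ == "__main__":
    uplift, commission, total = raas_invoice(
        base_fee=1_000,
        baseline_gmv=500_000,
        actual_gmv=650_000,
        commission_rate=0.05,
    )
    print(f"uplift={uplift:,.0f} commission={commission:,.0f} total={total:,.0f}")
```

With these made-up numbers, the customer pays a commission only on the 150,000 of additional GMV. If GMV had stayed at or below the baseline, the bill would be the base fee alone, which is the sense in which the customer's downside is limited.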

As the company has grown, Mei Tao said he has found a balance between commercial profit and capability improvement. On the one hand, the company continues to increase investment and conduct research on vertical foundation models, on the view that a stronger and more advanced underlying architecture creates a better foundation for model capabilities. On the other hand, alongside internal research, Zhixiang Future also opens itself to the outside world through open-source projects.