
Interview with Zhong Hao, the person in charge of Baidu Wenku: How does AI reshape the content creation ecosystem?

未来一氪 · 2025-08-06 14:13
Baidu Wenku undergoes AI reconstruction, and GenFlow scheduling technology makes its debut at WAIC 2025, enabling intelligent creation in multiple scenarios.

As large models shift from a technological singularity to an industrial foundation, and intelligent agents move from laboratories to production lines and clinics, the third wave of artificial intelligence is reshaping the global economy with unprecedented force.

China shows two advantages in this transformation: it is a testing ground with ultra-large-scale application scenarios, and it is also pushing into harder territory such as chip breakthroughs and open-sourcing algorithms. From single-point technical breakthroughs to ecosystem-level innovation, and from efficiency tools to engines of new-quality productivity, an AI development path with Eastern characteristics is rapidly emerging.

On July 26, the World Artificial Intelligence Conference (WAIC 2025), themed "The Intelligent Era: Global Solidarity," gathered technology giants, academic pioneers, and policy-makers in the field of AI. This showcase spanning technology, ethics, and art signals that AI has evolved from an "industrial variable" into a "civilizational constant."

At this major event for the artificial intelligence industry, 36Kr acted not only as an industry observer but also as an active industry connector. It set up the "Krypton Star Live Studio" in the exhibition hall to uncover, through dialogues, the underlying logic driving the industry forward.

During the dialogue, Zhong Hao, the product director of Baidu Wenku, said that after being reconstructed with AI, Baidu Wenku can understand users' intentions fully and efficiently, meet their needs, and solve their problems in a one-stop, end-to-end manner. Baidu Wenku has adhered to the MoE (Mixture of Experts) architecture from the very beginning and conducts global scheduling through GenFlow. The starting point is to make AI learn and work like a human being, become the best partner of humans, and help users better create and consume content.

The following is the transcript of the dialogue, edited by 36Kr:

36Kr: First, please introduce yourself and your business.

Zhong Hao: I'm the product director of Baidu Wenku. People are quite familiar with Baidu Wenku as a database. After more than two years of AI reconstruction, we have closely integrated many AI capabilities with professional content. Baidu Wenku is now a one-stop AI content acquisition and creation platform with hundreds of multi-modal AI Agents that can help users solve many creation problems end-to-end. Baidu Wenku has taken on a new look. You can try it in the app or on the PC web page.

36Kr: How did you feel about coming to WAIC this time? What did you take away from it?

Zhong Hao: Baidu Wenku and Baidu Netdisk also exhibited this year. The AI industry is indeed developing rapidly. First, large models themselves are evolving. Whether in text-to-text, text-to-image, or video, the boundaries of the models have expanded greatly compared with previous years, and many interesting applications are on display. In addition, both startups and large companies have strengthened their determination to invest. This year's exhibition is thriving, with areas of consensus as well as areas that break the boundaries. Personally, I hope to see more imaginative attempts that break the boundaries, rather than the industry settling too early on local optima or short-term consensus solutions.

36Kr: Which exhibition booth impressed you the most?

Zhong Hao: Each booth has its own characteristics. Baidu Wenku and Baidu Netdisk mainly showcase scenario-based, end-to-end solutions to specific problems. From the start, we have focused on the adoption rate and usage rate of the content ultimately delivered to users, whether it was created by AI alone or jointly by AI and humans.

The booths of Baidu Wenku and Baidu Netdisk are designed around user scenarios. Whether for learning, work, life, or entertainment, Baidu Wenku and Baidu Netdisk offer solutions for each sub-scenario, so users can feel that their needs are fully met.

36Kr: In the process of Baidu Wenku's AI reconstruction, what do you think is the biggest technical difficulty? Which AI function is the most useful?

Zhong Hao: The most useful and most deeply developed function is the intelligent PPT. We were the first in China to develop this capability, and we have now refined it in depth for more than a dozen scenarios. Users can generate PPTs not only from instructions but also from pictures, documents, materials, and even authorized content in their personal Netdisk. We also support uploading and customizing templates, and can directly generate charts, data, and so on.

We have developed the intelligent PPT scenario deeply and comprehensively, and we have applied the same idea to many other scenarios. Baidu Wenku can now generate long and short texts, research reports, mind maps, AI picture books, posters, and more.

Regarding technical difficulties, the AI reconstruction of Baidu Wenku is not limited to building the AI Agents themselves. It is more about understanding users' intentions and having AI intelligently schedule multiple Agents to solve problems. For example, a freshman who has just entered university needs to plan a club activity. The student may not state clearly whether a PPT, a poster, or a planning document is needed. AI needs to proactively deliver a complete result, drawing on professional content and solutions from similar scenarios, and schedule different Agents to complete the task.

This raises the technical bar. You need to understand users, explore in depth the intentions and demand boundaries behind their queries, and work out how to solve their problems efficiently and reasonably. That calls for stronger capabilities in scenario mining, in understanding scenario requirements, and in scenario solutions, as well as deeper technology and a smarter AI. It also requires fine-grained, flexible Agents and supporting infrastructure such as AI readers and editors to handle more comprehensive tasks.
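Editor's illustration: the club-activity example above can be sketched as a small dispatcher that maps one vague request to several deliverables. This is a minimal sketch and not Baidu Wenku's implementation. The agent names, the `SCENARIOS` table, and the keyword matching (a stand-in for model-based intent understanding) are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical agents: each takes the user's request and returns a deliverable.
AGENTS: Dict[str, Callable[[str], str]] = {
    "planning_doc": lambda req: f"Planning document for: {req}",
    "ppt": lambda req: f"PPT deck for: {req}",
    "poster": lambda req: f"Poster for: {req}",
    "research_report": lambda req: f"Research report for: {req}",
}

# Hypothetical scenario table: trigger keywords -> agents to schedule.
# A real system would infer the scenario with a model, not keywords.
SCENARIOS: List[Tuple[Tuple[str, ...], List[str]]] = [
    (("club", "activity", "event"), ["planning_doc", "ppt", "poster"]),
    (("market", "industry", "analysis"), ["research_report"]),
]


def plan_agents(request: str) -> List[str]:
    """Pick the agents for a request, even if no deliverable is named."""
    text = request.lower()
    for keywords, agents in SCENARIOS:
        if any(keyword in text for keyword in keywords):
            return agents
    return ["planning_doc"]  # fallback when no scenario matches


def handle(request: str) -> List[str]:
    """Run every scheduled agent and collect the deliverables."""
    return [AGENTS[name](request) for name in plan_agents(request)]


for deliverable in handle("Help me plan a club activity for freshmen"):
    print(deliverable)
```

The point of the sketch is that the user never asks for a poster or a PPT; the scenario decides which Agents run.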

36Kr: As a national-level product, what disruptive changes has Baidu Wenku brought after the AI-native reconstruction?

Zhong Hao: The biggest change is that in the past, as a database, Baidu Wenku only met a small part of users' needs. Users came to Baidu Wenku, found a piece of content they needed, downloaded it, and that was it. Their demands were not fully met. Now, Baidu Wenku can not only find content efficiently but also complete comprehensive tasks from scratch or based on existing content. On the one hand, real-time human-machine interaction lets AI understand the need. On the other hand, with infrastructure such as the AI integrated editor, users can create while thinking and direct AI to complete tasks. AI can solve problems for you automatically, and the integrated editor helps semi-automatically. Users don't need to switch between multiple applications and can finish their work quickly in one place. The product has changed from a database into a one-stop platform, which is what users really want.

36Kr: When large models are put into products, people run into the "impossible triangle" of cost, quality, and latency. How do Baidu Wenku and Baidu Netdisk address it?

Zhong Hao: Baidu Wenku adopted the MoE architecture during the reconstruction. At that time, prompt engineering was popular in the industry, but we found that it could hardly push past the boundaries of model capabilities and would run into the impossible triangle. We first optimized Agents for specific scenarios, such as intelligent PPT, long and short texts, research reports, and mind maps. We found the best balance point while refining each scenario, which solved the impossible triangle locally. For example, when inserting pictures into a PPT, we can either generate pictures or retrieve content directly from Baidu Wenku, and the latter may be faster and better. By refining each sub-scenario in detail, we gradually found the best balance point and enabled users to get high-quality results quickly.
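Editor's illustration: the picture example above amounts to choosing, per sub-scenario, the cheapest option that still meets quality and latency bounds. The sketch below shows one way to express that choice. The `Option` fields, the thresholds, and all numbers are made up for illustration and are not measurements from Baidu Wenku.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Option:
    name: str
    quality: float    # 0..1, higher is better (assumed score)
    cost: float       # relative cost per request, lower is better
    latency_s: float  # seconds, lower is better


def pick_option(options: List[Option], max_latency_s: float,
                min_quality: float) -> Optional[Option]:
    """Return the cheapest option that meets the latency and quality bounds."""
    feasible = [o for o in options
                if o.latency_s <= max_latency_s and o.quality >= min_quality]
    if not feasible:
        return None
    # Break cost ties by preferring the faster option.
    return min(feasible, key=lambda o: (o.cost, o.latency_s))


# Illustrative numbers only.
PICTURE_OPTIONS = [
    Option("generate_image", quality=0.80, cost=5.0, latency_s=8.0),
    Option("retrieve_from_library", quality=0.85, cost=0.5, latency_s=0.3),
]

choice = pick_option(PICTURE_OPTIONS, max_latency_s=2.0, min_quality=0.7)
print(choice.name if choice else "no feasible option")
```

Tuning the bounds separately for each sub-scenario is one reading of "locally solving" the triangle: no single global setting has to fit every task.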

On this basis, we launched the GenFlow scheduling center, and version 2.0 will be launched soon. GenFlow can automatically schedule Agents to solve problems according to users' intentions. For the same request, it can judge whether Plan A is more "efficient, cost-effective, and of high quality" than Plan B. At the bottom is MoE; above it are scenario-specific Agents refined to reach a locally optimal balance in hundreds of scenarios; on top, GenFlow carries out global scheduling. The better you understand users' needs and the intentions behind their queries, the better you can solve problems. The key lies in how close you are to the scenarios and whether you are willing to go deep and refine them.

Many products in the industry have run into the impossible triangle and are eager to find one general solution to all users' problems. That is actually very difficult. To find the optimal balance point in the short term, you need to go deep and refine the scenarios. It's difficult, but it gets you closer to the ultimate goal.

36Kr: Actually, it's about flexible adjustment within specific scenarios.

Zhong Hao: Yes, by using PMF (product-market fit) to meet users' needs. For example, when Baidu Wenku's research report function was first launched, it was meant to serve users' needs for analysis and research. How do we make sure that when users only need a simple analysis, we don't generate a research report of tens of thousands of words? The function was very advanced at the time, but it may not have been what users wanted. If users only need a simple analysis, we avoid generating redundant content, which saves cost and time and also addresses the impossible triangle. In many cases, the answers lie in the details of the scenarios. If you are willing to go deep and refine them, you will get closer to the goal.

36Kr: Actually, being able to achieve this is still based on Baidu's accumulation in AI technology.

Zhong Hao: Yes, we invested earliest and have adhered to the MoE architecture from the beginning. As an application-side product, we explored the boundaries on top of the Wenxin (ERNIE) series of models. In the end, we turned good ingredients into a delicious meal. That depends not only on technical accumulation but also on staying close to users and scenarios so that the accumulated technology delivers its full value.

36Kr: How does GenFlow convert models into productivity? What are the advantages in multi-agent collaboration?

Zhong Hao: Our initial idea for GenFlow was to solve the problem that users don't know about the platform's many capabilities. Many users don't know that Baidu Wenku and Baidu Netdisk have hundreds of capabilities. When users interact with AI, they come with demands. Geeks will explore deeply, but ordinary users shouldn't have to dig out every function. AI should offer its services more proactively, so that users no longer have to describe a prompt precisely or find the right entry point before they can use a function. We hope AI can work proactively like a human being: it lets you ask follow-up questions and make suggestions during the interaction, and it provides solutions in parallel, handling many tasks at once. Only AI with these characteristics can be regarded as active AI.

Therefore, version 2.0 of GenFlow will soon launch an intervention mode, a parallel mode, and the ability to think actively. It can proactively draw on users' past conversations and memory banks, understand the needs behind those conversations, and handle N tasks in parallel.

This is our innovation in the industry. Currently, most AI solves tasks serially, but humans can handle work in parallel, and we believe AI can too. So we have implemented parallel processing and launched an intervention mode that allows users to interrupt, supplement, and modify content at any time. AI understands users' historical conversations and authorized materials and solves problems actively and flexibly, keeping the conversation smooth throughout. This gain in initiative makes the most of GenFlow's technical capabilities. Our starting point is to solve the problem of human-machine interaction, and the goal is to make AI serve humans more actively and comprehensively.
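Editor's illustration: the parallel and intervention modes described above can be sketched with Python's `asyncio`. Two simulated agents run concurrently, and a note from the user is delivered to one of them while it is still working. This is only a sketch under stated assumptions: `run_agent`, the per-agent note queues, and the timings are hypothetical and say nothing about how GenFlow is actually built.

```python
import asyncio
from typing import List


async def run_agent(name: str, steps: int, notes: asyncio.Queue) -> str:
    """Simulated agent that checks for user notes between work steps."""
    applied: List[str] = []
    for _ in range(steps):
        await asyncio.sleep(0.01)  # stand-in for one unit of real work
        while not notes.empty():   # pick up any user intervention
            applied.append(notes.get_nowait())
    suffix = f" (revised with: {', '.join(applied)})" if applied else ""
    return f"{name} done{suffix}"


async def main() -> List[str]:
    poster_notes: asyncio.Queue = asyncio.Queue()
    ppt_notes: asyncio.Queue = asyncio.Queue()
    # Parallel mode: both agents start at once instead of one after another.
    tasks = [
        asyncio.create_task(run_agent("poster", 5, poster_notes)),
        asyncio.create_task(run_agent("ppt", 5, ppt_notes)),
    ]
    # Intervention mode: the user adds a requirement mid-task.
    await asyncio.sleep(0.015)
    await poster_notes.put("use the club colors")
    return await asyncio.gather(*tasks)


results = asyncio.run(main())
for line in results:
    print(line)
```

The agent polls its queue between steps, so a user note changes the result without restarting the task, and the other agent is unaffected.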

36Kr: Looking ahead to the next 3 to 5 years, as technology becomes more and more mature, what new changes will occur in the entire industry? Will there be any new strategic deployments?

Zhong Hao: In the future, human-machine interaction will definitely become more in-depth and comprehensive, penetrating into all aspects of human work, study, life, and entertainment. The boundaries of interaction methods will also become more blurred, and more tasks will be completed jointly by humans and AI.

This is also why Baidu Wenku and Baidu Netdisk launched GenFlow. First of all, we hope that AI can think, learn, and work like a human being and become the best partner of humans. Strategically, we have always built along the main content pipeline, from the starting point of content production to the end point of content consumption, which is what Baidu Wenku and Baidu Netdisk have been doing. We hope that AI can help everyone better create and consume content.

One day in the future, we will try new forms of work and study. Maybe we won't need a computer. We could take a device out of our pocket and finish in 3 to 5 minutes the work that used to take one or two weeks.

For example, when relaxing and having fun, we can easily convert novels into animations without having to read the text line by line, letting AI realize our imagination. When we read many literary works, we also have our own ideas, and at this time, we become creators.

AI is my best helper. Although I have never learned to paint and am not a professional editor, AI can help me realize my ideas the way a team would, lowering the threshold for creation as far as possible and allowing everyone with creativity to fully express their ideas and be seen by more people.

While understanding the needs of each user, AI recommends better content to them, improving efficiency and saving time. People can give full play to their creativity through a lower-threshold AI platform, and the content produced will be consumed by more people. I'm really looking forward to such a new world.