
Baidu is changing itself.

晓曦 2025-07-02 21:36
On July 2nd, Baidu officially announced at AI Day that its search box had been upgraded to an "Intelligent Box". Baidu Search now also has a clearer underlying logic: a new ecosystem architecture consisting of a bottom layer (LLM + video generation model), a middle layer (MCP + agent tools + real-person services), and an upper layer (Intelligent Box + Baikan + upgraded AI Assistant).

What does search look like in most people's minds? A box, a sentence, a question, and every answer comes back from that one search box.

What if this search box has its own "will"?

What if the "answers" are no longer just simple aggregations of text?

More than a month ago, Google made a splash at the I/O Conference with its full suite of AI products. One of the most crucial changes was the announcement that AI search had "graduated" from the lab and officially entered people's lives. Google CEO Sundar Pichai said at the conference that queries in AI Mode are two to three times, and sometimes even five times, as long as traditional Google searches. AI Mode also supports complex data visualization, agentic checkout, virtual try-ons for shopping, and more.

It is hard not to marvel at how profoundly AI has affected everyone's life. A month later, Baidu followed in a similar vein with its biggest overhaul in nearly a decade, presenting a brand-new search box to the public.

On July 2nd, Baidu officially announced at AI Day that its search box had been upgraded to an "Intelligent Box". The underlying logic is now clearer as well: a new ecosystem architecture consisting of a bottom layer (LLM + video generation model), a middle layer (MCP + agent tools + real-person services), and an upper layer (Super Intelligent Box + Baikan + upgraded AI Assistant).

More specifically, the bottom layer is supported by the Wenxin Big Model, the video generation model Muse Steamer, and strong external models such as DeepSeek. The middle layer exposes a range of callable capabilities that help Baidu and the players in its ecosystem improve their product and service experiences. The upper layer consists of the Intelligent Box as the entry point, along with other consumer-facing products such as Baikan and the AI Assistant.

It is reported that the updated Intelligent Box supports text input of over a thousand characters. Photo, voice, and video input have also been comprehensively enhanced, and the box offers direct access to tools such as AI writing and AI image generation.

The "Baikan" function on the search results page has been upgraded. It not only outputs mixed content including text, images, audio, and video but also integrates capabilities such as intelligent agents and real-person services. The "AI Assistant" has added video calling and enhanced its multi-modal input, rich media output, one-stop workbench, and in-depth search capabilities. Intelligent creation has been upgraded too: a three-minute creative video can be generated from a single sentence, with support for shot editing and customization of the picture content.

When this search box extends beyond just "knowing" and expands into all industries, presenting a variety of rich media content, is it still positioned as just search? What is the new super-entrance of this era?

Perhaps, the narrative of the new search has already begun.

01

Is a larger search box still a search?

Judging by the most obvious visual change, Baidu's search box really has become larger.

As the search box gets larger, it can also carry more "content". This is a leap for Baidu Search from a tool-oriented service to a content-oriented one, a transformation from efficient information retrieval to complex task delivery, and from a search engine to an AI ecosystem.

When you open the new Baidu Search, in addition to the larger search box, you also see several navigation tabs with suggested prompts, such as AI writing, AI drawing, AI travel planning, and AI problem-solving. The prompt examples are varied, such as "What are the common types of fingernails?", "A DIY tutorial for best-friend bracelets", and "Help me write some elegant five-character couplets". Even ordinary people who don't know how to write prompts can quickly start using AI search.

It should be noted that the search results cover multiple modalities, a function Baidu calls "Baikan". In simple terms, it presents the most useful content in the most suitable multi-modal rich media form, which may be concise, efficient text or a video that carries more information.

Notably, in the past, searches were often for definite needs. People clearly issued instructions and then retrieved the answers to their questions. However, in the AI era, most needs are vague and uncertain. The new search box can better meet these needs.

More specifically, the AI generative camera, combined with Baidu's multi-modal AI big model capabilities, can understand users' search needs through pictures without text input and then provide structured content answers. Common examples include travel guides and solutions to everyday electrical appliance failures. According to Baidu, it will also provide intelligent Q&A based on more specific scenarios in the future and offer more personalized content answers based on user preferences.

According to the disclosure, Baidu Search currently supports the recognition of multiple dialects, including Cantonese, Sichuanese, and Henanese, accurately understanding users' intentions and saving users' typing time.

At the release event, Baidu also launched its self-developed video generation model, Muse Steamer, and a video product called Huixiang. Video generation has become a "battleground" this year. Since OpenAI's Sora made its debut at the beginning of last year, China's large tech companies and startups alike have bet on video generation models.

The advantage of Baidu's self-developed video generation model lies in its delivery speed and quality. It can generate 10-second 1080P high-definition videos with a cinematic aesthetic at a lower cost. According to Liu Lin, general manager of business R&D in Baidu's commercial system, the Turbo version, an all-around model, takes only 2 minutes to generate a 5-second video, supports 720P resolution, and covers most creative scenarios.

This year, AI video content has exploded on multiple platforms. For example, AI-generated animal sports competitions and AI stress-relief videos have become popular formulas. According to the "China Internet Audio-Visual Development Research Report (2025)" released in March this year, the rapid development of generative artificial intelligence is reshaping how Internet audio-visual content is produced. The share of users who use AI tools to produce pictures and videos rose from 25.6% to 31% within half a year.

02

The new super-entrance is here

Whether in the era of PC Internet or mobile Internet, search has been the well-deserved super-entrance. However, in the AI era, the status of this super-entrance has begun to waver.

In the past two years, emerging big model companies have all aimed to create a super-App, touting it as the new super-entrance. First, Kimi spent aggressively on advertising to win users. Then DeepSeek became popular on the strength of its model capabilities and shot to the top of the App charts. According to QuestMobile data, as of February 2025, the monthly active users (MAU) of AI-native Apps reached 240 million.

Although the user base of AI-native applications keeps growing, no product has yet emerged with more than 100 million daily active users, the level that would qualify it as a super-App. In practical terms, what should a super-entrance in the AI era look like?

Zhu Xiaohu, a partner at GSR Ventures, once said that, simply put, a high-quality model combined with a stronger traffic platform is more likely to become the super-entrance in this era.

To some extent, Baidu does have both. It has been accumulating users since the PC Internet era, and as of the first quarter of 2025, the Baidu App had 724 million monthly active users, a sizable active user base that provides a continuous flow of traffic.

On the other hand, Baidu itself is a company with a strong technological foundation. It was one of the first in China to launch a base big model and has kept iterating on its models at a relatively high frequency. In February this year, Baidu also took the lead in integrating DeepSeek. In essence, it is providing users with better and more stable model capabilities through an open and cooperative attitude.

Moreover, to solve the problem of the usability of the base big model for ordinary people, Baidu has strongly supported the MCP ecosystem and connected truly usable MCP services.

Robin Li, the founder of Baidu, once emphasized at the Create 2025 Developer Conference that MCP can be regarded as a "universal socket" connecting big models with the real world, enabling AI to directly call external tools and data sources and achieving a new leap from dialogue-based Q&A to task execution.

The user experience has also made a leap. In the past, people could only conduct simple Q&A and obtain answers through search. Through MCP, however, users can directly query inventory in a database, access e-commerce platforms to read reviews, and even call a payment interface to complete an order.
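The inventory, review, and payment flow described above can be sketched in a few lines. This is only an illustrative toy, not Baidu's implementation: the tool names (`query_inventory`, `read_reviews`, `create_order`), the SKU, and the in-memory data are all hypothetical, and the "server" runs in-process rather than over a network. What it does borrow from MCP is the general message shape, a JSON-RPC 2.0 request with the `tools/call` method that carries a tool name and arguments.

```python
import json

# Hypothetical in-memory data standing in for a merchant's real systems.
INVENTORY = {"sku-1001": 3}
REVIEWS = {"sku-1001": ["Works well", "Fast delivery"]}


def query_inventory(sku):
    """Hypothetical tool: report how many units of a SKU are in stock."""
    return {"sku": sku, "in_stock": INVENTORY.get(sku, 0)}


def read_reviews(sku):
    """Hypothetical tool: return the stored reviews for a SKU."""
    return {"sku": sku, "reviews": REVIEWS.get(sku, [])}


def create_order(sku, quantity):
    """Hypothetical tool: place an order and decrement stock."""
    if INVENTORY.get(sku, 0) < quantity:
        raise ValueError("insufficient stock")
    INVENTORY[sku] -= quantity
    return {"sku": sku, "quantity": quantity, "status": "paid"}


# Registry mapping tool names to the functions that implement them.
TOOLS = {
    "query_inventory": query_inventory,
    "read_reviews": read_reviews,
    "create_order": create_order,
}


def handle_request(raw):
    """Toy server side: dispatch one JSON-RPC 2.0 'tools/call' request."""
    request = json.loads(raw)
    params = request.get("params", {})
    response = {"jsonrpc": "2.0", "id": request.get("id")}
    if request.get("method") != "tools/call":
        response["error"] = {"code": -32601, "message": "Method not found"}
        return json.dumps(response)
    if params.get("name") not in TOOLS:
        response["error"] = {"code": -32602, "message": "Unknown tool"}
        return json.dumps(response)
    try:
        payload = TOOLS[params["name"]](**params.get("arguments", {}))
        is_error = False
    except (TypeError, ValueError) as exc:
        # Tool-level failures are reported inside the result, not as a protocol error.
        payload, is_error = {"message": str(exc)}, True
    response["result"] = {
        "content": [{"type": "text", "text": json.dumps(payload)}],
        "isError": is_error,
    }
    return json.dumps(response)


def call_tool(request_id, name, arguments):
    """Toy client side: build the request, send it, and decode the tool output."""
    raw = json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })
    reply = json.loads(handle_request(raw))
    if "error" in reply:
        raise RuntimeError(reply["error"]["message"])
    return json.loads(reply["result"]["content"][0]["text"])


def demo():
    """Check stock, read reviews, then place an order, as an agent might."""
    stock = call_tool(1, "query_inventory", {"sku": "sku-1001"})
    reviews = call_tool(2, "read_reviews", {"sku": "sku-1001"})
    order = call_tool(3, "create_order", {"sku": "sku-1001", "quantity": 1})
    return stock, reviews, order
```

Calling `demo()` walks through the three steps in order. In a real deployment the model, not hand-written code, would decide which tool to call next, and the tools would sit behind a network transport with authentication.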

The existence of MCP allows more small and medium-sized developers to better utilize industry data and tools when developing Agents and then meet the diverse needs of the public through Agents in specific fields.

According to official disclosures, the number of MCP services included on the Baidu platform has exceeded 18,000, covering multiple scenarios such as daily life, finance, and e-commerce.

This also gives Baidu Search a stronger content ecosystem in this upgrade, one that supports its new forms of content output. To some extent, search is not only the super-entrance for Baidu's AI but also the super-exit for its technology.

03

The awareness of the giants

At the end of the last century, two companies forever changed the way people obtain information. One is Google, and the other is Baidu. Twenty-five years later, the two companies have made almost the same choice in unison: to embrace AI vigorously and let it enter every aspect of people's lives, improving efficiency and quality.

The blockbuster announcements at the Google I/O Conference in May still leave a deep impression on many people. In essence, they turn AI search into an intelligent assistant at people's side, one that accepts multi-modal input and returns structured answers. This moves search from the link lists of the traditional era straight into a stage of intelligent, service-oriented solution delivery.

This shows the foresight of the search giants. After all, when generative AI had its major breakthrough, the outside world turned its attention to search engines, believing that they should be sounding the alarm. Google's top management was the first to feel the pressure and treated the release of ChatGPT as a top-level warning for the company. After that, Gemini caught up quickly. Relying on Google's years of ecosystem advantages and search capabilities, it soon closed in on OpenAI. The outside world generally regards this year's Google I/O Conference as a perfect comeback.

Of course, this is not just the awareness of the giants. Whether it is Google or Baidu, the essence of the transformation is to follow the footsteps of users.

It is understood that this overhaul, the biggest in a decade, is an active transformation and positive exploration by Baidu Search in response to the state of the industry. From generalization to personalization, from tool orientation to content orientation, and from information acquisition to task completion, Baidu Search is striving to expand the boundaries of its search capabilities.

Zhao Shiqi, vice president of Baidu and general manager of Baidu Search, also told us at the media briefing that day that the market environment is undergoing various changes. "The users' needs themselves are changing, and the competitive environment has also changed a lot. Moreover, Baidu Search is not only competing with other search engines but also with other information-providing products."

He believes that as users' needs grow, first, the product must be transformed. "We must not view it with the old product definition of a search engine. So, internally, we always say to forget about the old-fashioned search form. Therefore, there need to be significant changes in the overall product form of search."

Second, at the technical level, AI is not automatically linked to search. "Many companies doing AI today don't do search. Among those that do both AI and search, Google is definitely one, and Baidu is another. There aren't many such companies." To this day, Baidu is still deeply exploring the combination of AI models and search models. To some extent, this is a clear advantage that Baidu has earned through its proactive approach.

Moreover, in the AI era, people have a greater need for an ecosystem. Baidu has the ability and motivation to build one, and this is what differentiates it from other companies and products. The effect could be similar to Google overtaking OpenAI on the bend.

At the beginning of this year, as DeepSeek quickly became popular, some startups gave up pre-training big models. However, large companies including Alibaba, ByteDance, and Baidu have once again taken on the responsibility of exploring the upper limit of AI capabilities and have kept investing in the R&D and training of base big models.

However, this is not enough. Although base models are much more capable than when ChatGPT was first released in 2022, many ordinary people still haven't experienced their advantages. A person in the industry told 36Kr, "The 'next ChatGPT' hasn't emerged yet. Before ChatGPT appeared, GPT-3 had been out for two years, but the outside world didn't have a strong perception of it." In his view, it takes a good product to make people quickly perceive a model's capabilities, and there is still a chance for the next such product to emerge.

Compared with traditional information retrieval tools, an intelligent assistant that can understand users' intentions, provide proactive services, and assist users in completing various tasks is what users are looking forward to as the next super-product. The mobile Internet has been booming for more than a decade, but the next disruptive product has not yet emerged. People are still waiting for the one that can truly change the way users interact with the Internet.

This article is from the WeChat public account "Intelligent Emergence", and is published by 36Kr with authorization.