It's time for AI phones to enter a new stage.
Mobile phone manufacturers face two opportunities: to take a deeper part in reconstructing the AI interaction system, and to establish their hardware as a more direct entry point.
Today (the 21st), Cook announced that John Ternus, currently senior vice president of hardware engineering, will become CEO of Apple in September this year. To lead Apple into the AI era, Cook has chosen a senior engineer and product expert who has spent 25 years growing within the Apple system.
This suggests Cook believes Apple will need more innovations that "make products better, bolder, more beautiful, and more meaningful" in the future. The announcement also comes just as the emergence of Lobster has prompted a rethink of smartphone innovation.
Recently released phones such as the Huawei Pura 90, OPPO Find X9s Pro, and REDMI K90 Max still emphasize AI enhancements to specific capabilities such as shooting and memory. We believe, however, that mobile phones need to move from AI empowering individual capabilities to a new stage of AI-led, system-level reconstruction.
The emergence of Lobster and Skill is the premise for this judgment. In one interesting experiment a few days ago, the owner of Jinguyuan Dumpling Restaurant had a sudden idea, vibe-coded a "Jinguyuan Dumpling Restaurant · SKILL", launched it, and promoted it through the restaurant's official WeChat account.
Anyone who installs this skill for their Lobster can ask the Lobster for detailed business information about Jinguyuan Dumpling Restaurant, such as opening hours, whether takeout is available, and the WiFi password. The owner of Jinguyuan is now working with the Meituan team to explore how to implement a queue-ticket function.
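To make the idea concrete, here is a minimal Python sketch of what such a skill could look like: a small bundle of business facts that an agent queries on the user's behalf. The article does not describe the real skill's format, so the `RestaurantSkill` class, its keyword matching, and the placeholder values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RestaurantSkill:
    """Hypothetical skill: a named bundle of facts an agent can query."""

    name: str
    facts: dict[str, str] = field(default_factory=dict)     # fact key -> answer text
    keywords: dict[str, str] = field(default_factory=dict)  # trigger word -> fact key

    def answer(self, question: str) -> str:
        """Return the first fact whose trigger word appears in the question."""
        q = question.lower()
        for keyword, key in self.keywords.items():
            if keyword in q:
                return self.facts[key]
        return "Sorry, this skill has no information on that."


# Placeholder values only; the real business details are not given in the article.
skill = RestaurantSkill(
    name="Jinguyuan Dumpling Restaurant · SKILL",
    facts={
        "hours": "Opening hours: <placeholder>",
        "takeout": "Takeout: <placeholder>",
        "wifi": "WiFi password: <placeholder>",
    },
    keywords={"open": "hours", "hours": "hours", "takeout": "takeout", "wifi": "wifi"},
)

print(skill.answer("What is the WiFi password?"))
print(skill.answer("When do you open?"))
```

A real agent would presumably match questions to facts with a language model rather than with keywords; the point of the sketch is only that a skill can be a small, task-specific unit rather than a full app.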
The owner's vision for the future is that customers entering the restaurant no longer need to scan a QR code to order. Instead, they can ask their Lobster to recommend a suitable menu for the meal. In this concept, the familiar experience of "clicking" through an order on the phone is replaced by a short chat with the Lobster. The Lobster may even know better than you which dumplings and cold dishes you should have for this meal.
The entire in-store dining experience that mobile phones and apps shaped in the mobile Internet era is being remade by the emergence of Lobster. When people no longer rely on graphical interfaces and touch operations to complete tasks, and the end point of interaction becomes a dialog box, the mobile phone, the core hardware product that has dominated human digital life for nearly two decades, will change accordingly:
First, although facing challenges from AI hardware, the future mobile phone will still be a major social and entertainment center, providing entertainment capabilities such as audio, video, and games, and meeting users' online social needs, allowing users to chat, browse Douyin, play games, and watch dramas.
Second, the mobile phone remains relatively secure hardware for storing user data and a task execution center with stronger all-round capabilities.
Third, in the AI era, the mobile phone, as the most complex sensor cluster around people, will take on an important new role as a perception center: it perceives and learns users' behavior habits, environment, and intentions, and provides personalized context for AI.
Fourth, mobile phones need to adapt to new interaction methods and build an operating system more suitable for the Agent era. They can no longer use the "mindset of the App era" to handle tasks in the Agent era.
Lobster Brings New Exploration Directions
Mobile phone manufacturers were not late to explore AI. From the machine learning stage to the large model stage, and now after the emergence of Lobster, they have kept up with technological progress and actively explored how to apply AI on mobile phones.
This exploration can be roughly divided into three categories:
The first category is the enhancement of specific capabilities. This is the earliest and most widely implemented AI attempt by mobile phone manufacturers.
Examples include capabilities we already know well, such as one-click removal of background clutter, automatic video editing, and generated record summaries, as well as the newly introduced XMAGE intelligent shooting and AI one-click flash memory functions. In essence, this type of exploration uses AI to improve single-point experiences. Its moat is shallow, and because these features compete with standalone AI products, users are less likely to choose them.
The second category builds an AI execution add-on on top of the Android graphical interface. The cooperation between Nubia and Doubao Mobile Assistant is the most radical attempt of this kind, and manufacturers such as Honor are making similar attempts. This solution combines AI's visual recognition ability (a VLA model) with underlying system permissions to imitate human operations on the phone, thereby working around the isolation between apps.
However, this "add-on" approach faces a core problem: forcibly breaking down the walls while third-party applications are reluctant to open up deeply often provokes rapid and severe bans. The root cause of the conflict is still the competition among Internet giants for the entry point. Doubao Mobile Assistant wants to establish a new AI-based entry point, but incumbent entry points such as WeChat and Taobao do not want to become subordinate to it.
Although there are rumors that the second-generation Doubao AI mobile phone will be released in the second quarter, and that two of the top five mobile phone manufacturers are also in contact with it, this path of confronting Internet giants head-on is not the best choice for manufacturers' long-term development. The add-on solution is more like a transitional product on the way to an AI OS, and it is difficult to build a business model around it that suits Agents.
If mobile phone manufacturers must embrace AI, then the third type of exploration, Mobile Claw, may better fit their development needs. Both Xiaomi and Huawei are testing their own Mobile Claw products. Xiaomi's MiClaw is the first mobile Lobster product launched by a mobile phone manufacturer. Huawei's Xiaoyi Claw also quickly delivered an out-of-the-box "Lobster-raising" experience on mobile phones, with multi-device collaboration.
With the arrival of the first year of Lobster, consensus on the Agentization of mobile phone interaction is forming faster. Lei Jun believes that Lobster may be a new form of AI OS for Xiaomi. Hu Baishan, the president of vivo, has also said that Agents will comprehensively reconstruct the product interaction paradigm, and that mobile phones will evolve from Smart Phones to Agent Phones.
Abandoning the constraints of the graphical interface and building a new AI OS around the interaction experience gives mobile phone manufacturers the chance to stop depending on the goodwill of Internet giants. Instead, by reconstructing the interaction experience and its rules, they need to attract Internet giants to help build a more open Agent system.
The New Mobile Phone OS Has Four Features
This Agentized interaction on mobile phones is still in the early stage of exploration. We can try to summarize its key features from the recent actions of mobile phone manufacturers.
First, a more Agentized assistant. Mobile phone assistants such as Xiaomi's Super Xiaoai, Huawei's Xiaoyi, and Honor's YoYo are all strengthening their AI capabilities and may even gradually become Lobsterized. They are no longer simple voice controllers but intelligent agents with active planning capabilities.
Apple is likewise working harder to improve Siri's AI capabilities. According to the announced news, this year's WWDC will focus on AI progress and on new software and developer tools. It remains to be seen whether Apple's new CEO will bring bolder innovations to the mobile phone AI experience when he takes office in September.
Second, an emphasis on building a personal knowledge base. The mobile phone is a carrier of personal data and memory. Manufacturers keep improving the phone's memory ability in order to accumulate the personalized context that AI needs to understand and execute tasks.
Honor has chosen an on-device Memory-in-Context route, building a bionic strategy of "long-term memory + short-term memory + instantaneous memory". Nothing's Essential Memory can extract important information from content that users save and automatically supplement it with personalized background information. Xiaomi's Super Xiaoai can collect screen content through the "Xiaoai Memory" function.
Third, an emphasis on multi-modal perception. Multi-modal perception lets AI on the phone listen to speech, view the screen, and understand the world through the camera.
Honor's R&D team has introduced a multi-modal model so that YoYo can better understand the content on the screen and even the physical world seen by the camera. Vivo emphasizes that imaging is the core "eye" through which AI perceives the physical world, and aims to make the mobile phone a digital partner with perception ability.
Fourth, an emphasis on foundation model capability. In March, Xiaomi released three models: MiMo V2 Pro, MiMo V2 Omni, and MiMo V2 TTS. Of these, MiMo V2 Pro has more than one trillion parameters and supports a context window of 1 million tokens. Apple, for its part, is collaborating with Google to build the next-generation Apple Foundation Models (AFM) based on the Gemini model and Google Cloud technology, while also continuing to develop its own foundation model.
These four features correspond to the abilities to execute, remember, perceive, and think. This means that the new OS no longer simply sells hardware and software services; it offers the soil in which personal Agents are cultivated.
The New System Also Needs to Build Two Ecosystems
During the process of mobile phone Agentization, the hardware ecosystem and application ecosystem that constitute its experience are also changing accordingly.
In the hardware ecosystem, hardware and assistants will combine more closely to form the interactive experience that serves users. This means that mobile phones need to form a more tightly connected hardware network with AI glasses, AI rings, AI necklaces, and home IoT devices, one that can carry the communication of user needs and the perception and flow of data that Agentization requires.
Vivo is investing resources in three core hub products (mobile phones, head-mounted displays, and robots) and, on that basis, popularizing "imaging + AI" capabilities. The mobile phone is the source of perception, remembering each user's personalized features and habits on the device; the MR head-mounted display is the training ground for spatial computing; and the home robot, as the ultimate form of intelligence, will gather perception and act on the physical world.
At the application ecosystem level, Apps may be replaced by more atomic Skills. Guan Haitao, the CMO of Honor, posted on Xiaohongshu, "The technological mainstay in the mobile era is Apps, while in the AI era, it is Agents." Apps are more fixed, seeking the greatest common divisor for task execution; Agents and Skills are more flexible and can meet personalized and detailed needs.
An interesting exploration we have seen is the Essential Apps launched by the mobile phone manufacturer Nothing. Users can quickly create applications such as today's outfit recommendations and home screen display widgets according to their usage habits and daily needs. Then users can also publish the Essential Apps they created on Nothing's Playground platform for other users to discover and download.
Each Essential App usually corresponds to a specific and clear task, which is itself a product form similar to Skill. In the future, more people may develop more diverse Skills based on mobile phones and vibe coding, and then distribute them in a Skill market similar to Playground, forming a new experience of application production, sharing, and use.
Nothing's judgment on this is that personal computing is entering a new stage - devices are starting to adapt to people, rather than people adapting to devices.
"Apps have required people to follow preset applications, menus, and operation processes for many years. If there is no ready-made function, people can only wait for others to develop it; if the function does not fully meet the needs, they can only make do with it. In the new world shaped by AI, this model is no longer reasonable."
This may also lead to the exploration of a new business model. Shifting from "download - use" to "demand trigger - Agent call - Skill execution" means revising the traffic-centered operating logic of apps and proposing a new one suited to Agents, which users call on and then leave. What this model will look like is not yet clear, but it is very likely to combine free basic traffic with on-demand value-added services.
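The "demand trigger - Agent call - Skill execution" chain can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Agent` class, the explicit `intent` argument, and the per-skill call counter are hypothetical, and a real agent would infer the intent from the user's words rather than receive it as a parameter. The counter is included merely to show where on-demand usage could be measured.

```python
from collections import Counter
from typing import Callable

# A skill is modeled here as a plain function from a demand to a reply (assumption).
Skill = Callable[[str], str]


class Agent:
    """Hypothetical agent that routes a user demand to an installed skill."""

    def __init__(self) -> None:
        self.skills: dict[str, Skill] = {}
        self.calls = Counter()  # per-skill usage count, e.g. a basis for on-demand services

    def register(self, intent: str, skill: Skill) -> None:
        """Install a skill under the intent it serves."""
        self.skills[intent] = skill

    def handle(self, intent: str, demand: str) -> str:
        """Demand trigger -> Agent call -> Skill execution."""
        skill = self.skills.get(intent)
        if skill is None:
            return f"No skill installed for '{intent}'."
        self.calls[intent] += 1
        return skill(demand)


agent = Agent()
agent.register("order", lambda demand: f"Recommended menu for: {demand}")

print(agent.handle("order", "dinner for two"))
print(agent.handle("queue", "table for four"))
print(agent.calls["order"])
```

Unlike an app, nothing here is downloaded and opened by the user: the skill runs only when a demand triggers it, which is why usage, not traffic, becomes the natural unit to count.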
In 1987, John Sculley, the former CEO of Apple, conceived the "Knowledge Navigator", predicting an intelligent assistant that can talk to people and handle complex tasks. The smartphone is the carrier born to carry this assistant.
Nearly forty years later, with the implementation of various Lobster and Lobster-like products, this concept is closing the loop, and the interaction of mobile phones is becoming closer to human intuition.
At this turning point, mobile phone manufacturers face two opportunities: one is to take a deeper part in reconstructing the AI interaction system, and the other is to truly realize the interconnection of all things and make full use of the more direct entry-point advantage of hardware such as mobile phones and AI glasses. Where the two meet, the AI assistants of mobile phone manufacturers may become the key entry points and capability distribution channels of the AI era.
Apple's strategic adjustment and the continuous iteration of Huawei, Xiaomi, OPPO, and vivo all point to this new development stage.
This article is from the WeChat official account "Narrowcast", author: Li Wei, published by 36Kr with authorization.