
The second-generation Doubao AI phone is confirmed. It doesn't just tap the screen for you; it can also collaborate with Agents.

Lei Technology, 2026-06-23 08:45
A mature AI phone should be more "restrained".

In recent days, new details have emerged about the second-generation Doubao AI phone. Industry outlet Xinliu Think Tank reported exclusively that ZTE's Nubia has scaled back its other phone lines across the board, concentrated its core resources on the second-generation Doubao AI phone, and plans to release it within the month.

It's just a few days away.

Although this report has not been officially confirmed by ZTE, Nubia, or ByteDance, the information made public over the past six months shows that Nubia and ByteDance have put considerable effort into the new-generation Doubao AI phone. At MWC at the end of February this year, Nubia president Ni Fei had already teased this second-generation Doubao AI phone, which he said "defines a new species of mobile phones".

Image source: Weibo

Three or four months before that, the first-generation Doubao phone appeared in the form of the Nubia M153. Although it was still an "engineering prototype" running a technical preview of the Doubao phone assistant, the product was already quite polished.

What was more intriguing, and what really triggered discussion about the Doubao phone, was that users could direct it in natural language to perform cross-application operations such as comparing prices, editing photos, checking tickets, placing orders, and sending messages. In some scenarios it could even act like a real person holding the phone: tapping through apps step by step, recognizing interfaces, and completing tasks.

The first-generation Doubao phone took the "AI phone" concept that manufacturers have been talking about for the past few years and advanced it to the stage of "AI operating the phone for you".

However, when AI no longer just answers questions, but instead clicks on the screen, invokes applications, accesses the photo album, handles payments and social relationships on behalf of the user based on the GUI (Graphical User Interface), it inevitably encounters issues of permissions and privacy, and also impacts today's Internet business ecosystem.

The first-generation Doubao phone quickly hit this wall. WeChat, Alipay, banks, shopping platforms... all instinctively grew nervous and restricted its invocations and operations. The wider public also began to debate system-level permissions, simulated input, account security, and privacy boundaries.

This is the question the second-generation Doubao AI phone must answer: it cannot simply be faster, more expensive, and more flagship-like than the first generation. It also needs to solve the privacy problem and turn from an "engineering prototype" into a mass-produced phone that ordinary people can use with confidence.

Hardware Upgrade, Making Room for the Agent

So far, little hardware information about the second-generation Doubao AI phone is available. The clearest report is that it is expected to use the Snapdragon 8 Elite Gen 5 (Qualcomm's fifth-generation Snapdragon 8 flagship). Considering that the first-generation M153 already used the Snapdragon 8 Elite with 16GB + 512GB of memory and storage, a 6.78-inch LTPO screen, and a 6000mAh battery, it is no surprise that the second generation stays on a flagship platform.

The first-generation Doubao AI phone. Image source: Lei Technology

By traditional mobile phone logic, these specifications are nothing new. In 2026, which Android flagship doesn't have a flagship chip and a large battery? The real change in the second-generation Doubao AI phone should instead be that the hardware makes new trade-offs around the Agent.

In the past, mobile phone hardware was built around serving apps. The chip had to launch apps quickly, the screen had to display well, the camera had to take strong photos, and the battery had to last a full day. With an AI Agent added, the phone takes on another class of continuously running tasks:

It needs to understand user instructions, recognize screen content, and be able to call on the camera, microphone, location, photo album, calendar, notifications, and application state at any time. It needs to decide between the cloud model and the edge-side model, and it must do all this without slowing down the system or significantly increasing heat and power consumption.

This means that the second-generation Doubao AI phone needs not only a stronger SoC, but also a complete systems-engineering effort around edge-side AI.
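The cloud-versus-edge judgment described above can be pictured as a small routing policy. The following is only a minimal sketch under stated assumptions: the `Task` and `DeviceState` types, the `route` function, and every threshold are hypothetical illustrations, not anything disclosed by Nubia, ByteDance, or Qualcomm.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    touches_private_data: bool  # e.g. screen content, photos, contacts
    est_tokens: int             # rough size of the prompt plus context


@dataclass
class DeviceState:
    battery_pct: int
    temp_c: float
    online: bool


# Hypothetical thresholds, chosen only for illustration.
EDGE_MAX_TOKENS = 2000
HOT_TEMP_C = 42.0
LOW_BATTERY_PCT = 15


def route(task: Task, device: DeviceState) -> str:
    """Return "edge", "cloud", or "defer" for a single agent task."""
    fits_on_edge = task.est_tokens <= EDGE_MAX_TOKENS
    if not device.online:
        # Without a network, only small tasks can run; the rest must wait.
        return "edge" if fits_on_edge else "defer"
    if task.touches_private_data and fits_on_edge:
        # Keep private context on the phone whenever the local model can cope.
        return "edge"
    constrained = (device.temp_c >= HOT_TEMP_C
                   or device.battery_pct <= LOW_BATTERY_PCT)
    if fits_on_edge and not constrained:
        return "edge"
    # Large tasks, or small ones on a hot or low-battery phone, go to the cloud.
    return "cloud"
```

In this sketch, a short "summarize this screen" request stays on the device, while a long planning task, or any non-private task on an overheating phone, is sent to the cloud.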

According to Qualcomm's description of the platform, beyond continued gains in CPU, GPU, and NPU performance, the core upgrades of the Snapdragon 8 Elite Gen 5 also include edge-side learning, real-time perception, a personal knowledge graph, and Agentic AI capabilities.

If the second-generation Doubao AI phone uses this chip, it should make the most of those edge-side capabilities. For example, it could keep part of the user's personal memories, preferences, frequent contacts, and frequent task flows on the device. When the user says "Help me book a ticket to Guangzhou tomorrow", it shouldn't ask about preferences from scratch every time. It should already know what kind of seat the user usually takes, which travel app the user usually uses, what the invoice title is, and whether the user prefers to set off in the morning.

Image source: Qualcomm

The richer the edge-side memory, the more the AI behaves like an assistant that truly understands the user's habits.
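The ticket-booking example can be made concrete with a small on-device preference store. This is a hedged sketch only: `LocalPreferences`, `missing_fields`, and the field names are hypothetical, and a real assistant would presumably use a far richer memory than a flat dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class LocalPreferences:
    """Hypothetical on-device store; nothing here is sent off the phone."""
    values: dict = field(default_factory=dict)

    def remember(self, key: str, value: str) -> None:
        self.values[key] = value

    def fill_defaults(self, request: dict) -> dict:
        # Anything the user said explicitly overrides remembered habits.
        return {**self.values, **request}


def missing_fields(request: dict, required: list) -> list:
    """List the required fields the assistant would still have to ask about."""
    return [key for key in required if key not in request]


prefs = LocalPreferences()
prefs.remember("seat_class", "second class")
prefs.remember("travel_app", "ExampleTravel")  # placeholder app name
prefs.remember("departure", "morning")

spoken = {"destination": "Guangzhou", "date": "tomorrow"}
booking = prefs.fill_defaults(spoken)
to_ask = missing_fields(booking, ["destination", "date", "seat_class", "travel_app"])
```

With the remembered habits merged in, `to_ask` is empty, so the assistant can go straight to a confirmation step instead of a round of questions.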

As another example, more multimodal understanding should happen on the edge side. When the user asks "Is this reliable?", "Help me summarize this", or "Send the address here to him" on any interface, the AI needs to understand the screen content quickly. Uploading a screenshot to the cloud every time strains speed, privacy, and stability.

A stronger NPU, more memory, and a capable local model would let these lightweight tasks be completed directly on the phone.

There is also an easily overlooked aspect: heat dissipation and battery life. On a traditional flagship phone, heavy load comes mainly from gaming and imaging, which users notice and which usually last for a defined period. The Agent's load, by contrast, may be more fragmented and more frequent. It may not run at full performance each time, but it may wait, monitor, recognize, summarize, and retrieve in the background all day long.

The second-generation product will therefore very likely keep a large-capacity battery, and it may also improve heat dissipation, memory, storage, and system scheduling. One can speculate further that its hardware design will be strengthened around several AI entry points: a dedicated AI key, a higher-quality microphone, more stable voice wake-up, stronger screen content recognition, better privacy prompts, and a body better suited to long periods of holding and voice interaction.

From the First Generation to the Second Generation, from "Operation" to "Collaboration"

More important than the hardware is the AI itself. By now it is almost certain that the "agent" path of the second-generation Doubao AI phone will change significantly, because the external environment has completely changed.

Image source: OpenClaw

Over the past six months, heavyweight products such as OpenClaw, Claude Code, and Codex have brought a very important change to the Agent ecosystem: Internet platforms are embracing Agents faster, and enabling Agent interaction through the MCP and A2A protocols or through official Skills.

MCP addresses how AI connects to tools and data sources. It replaces yesterday's custom interfaces with a more general connection method. Developers no longer need to write separate invocation logic for each service, and service providers can expose their capabilities in a more standardized way.

A2A solves the problem of how agents communicate with each other. The mobile phone system assistant can be an Agent, and there can also be Agents behind WeChat, Alipay, Feishu, and Taobao.

The system assistant doesn't necessarily have to click on the WeChat interface like a human being. Instead, it can send a clear request to WeChat's Agent: send a message to a certain contact or initiate a video call. Then WeChat will execute within its own security boundary and return the result to the mobile phone assistant.
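The request-and-return flow just described can be sketched in a few lines. This is an illustration under assumptions, not the actual A2A message schema or any WeChat interface: `AgentRequest`, `AgentResult`, `MessagingAppAgent`, and the action names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AgentRequest:
    sender: str   # which agent is asking, e.g. the phone's system assistant
    action: str   # what it wants done
    params: dict


@dataclass
class AgentResult:
    ok: bool
    detail: str


class MessagingAppAgent:
    """Hypothetical agent that lives inside a messaging app."""

    # Only these capabilities are exposed to outside agents.
    ALLOWED_ACTIONS = {"send_message", "start_video_call"}

    def __init__(self, authorized_senders):
        self.authorized_senders = set(authorized_senders)

    def handle(self, req: AgentRequest) -> AgentResult:
        if req.sender not in self.authorized_senders:
            return AgentResult(False, "sender not authorized")
        if req.action not in self.ALLOWED_ACTIONS:
            return AgentResult(False, f"action not exposed: {req.action}")
        # The app carries out the action itself, inside its own security
        # boundary. No taps are simulated on its interface.
        contact = req.params.get("contact", "unknown")
        return AgentResult(True, f"{req.action} completed for {contact}")


app_agent = MessagingAppAgent({"system_assistant"})
result = app_agent.handle(
    AgentRequest("system_assistant", "send_message",
                 {"contact": "Alice", "text": "Running late"})
)
```

The point of the design is that the app, not the assistant, decides what is exposed: a request for anything outside the allowlist, or from an unauthorized sender, is simply refused.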

This may sound like a mere change of technical route, but it is crucial for AI phones. The first-generation Doubao phone tried to "operate apps for users", and that GUI-based Agent route hit the existing ecosystem hard. The protocol-based Agent route, by contrast, is becoming more and more viable.

WeChat's recent promotion of A2A assistant capabilities with multiple mobile phone manufacturers is a very clear signal. WeChat hasn't fully opened its ecosystem, but it has started to allow the mobile phone system assistant to call WeChat's capabilities in specific scenarios, such as sending messages and initiating audio and video calls. The whole process emphasizes double authorization and also emphasizes that WeChat executes and returns the results itself.

Image source: Weibo

Even Doubao has taken a page from Qianwen over the past six months. On the one hand, it is connecting its own e-commerce, payment, and other service capabilities. On the other, it is connecting services from third-party platforms. For example, the Doubao app today began a limited gray-release test of one-click car-hailing in Beijing and Hangzhou, with Caocao Chuxing providing the ride service. Users state their travel needs directly in the chat box; the system automatically recognizes the location, number of passengers, and preferences, matches a route and price, and then lets the user confirm and place the order with one tap.

Image source: Weibo

It is therefore reasonable to predict that the second-generation Doubao AI phone will retain the GUI Agent, because a large number of long-tail apps cannot adopt a standard protocol right away. For high-risk services and powerful platforms, however, it will need more protocol-based, authorized connections.

Where it can invoke a service through A2A or a similar mechanism, it should not force simulated clicks. For operations that must be simulated, there should be clearer permission prompts, operation playback, confirmation at key steps, and risk interception. This will make the second-generation Doubao phone seem less "wild" than the first generation, but also closer to a phone that can actually be sold to ordinary people.
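The "protocol first, GUI only as a fallback" rule can be expressed as a simple planning step. The sketch below is hypothetical throughout: `PROTOCOL_APPS`, `HIGH_RISK_ACTIONS`, `Plan`, and `choose_path` are illustrative names, and the safeguards are reduced to boolean flags.

```python
from dataclasses import dataclass

# Hypothetical registry of app categories that expose a protocol endpoint.
PROTOCOL_APPS = {"messaging", "payments"}
# Hypothetical set of actions treated as high risk.
HIGH_RISK_ACTIONS = {"pay", "send_message", "login"}


@dataclass
class Plan:
    path: str                 # "protocol" or "gui"
    needs_confirmation: bool  # ask the user before the key step
    record_playback: bool     # keep a replayable record of simulated input


def choose_path(app: str, action: str) -> Plan:
    """Prefer an authorized protocol call; simulate input only as a fallback."""
    if app in PROTOCOL_APPS:
        return Plan("protocol", action in HIGH_RISK_ACTIONS, False)
    # Long-tail app without a protocol endpoint: fall back to the GUI Agent,
    # but always with confirmation and a playback record.
    return Plan("gui", True, True)
```

Under these assumptions, paying through a connected payments app goes over the protocol with a confirmation, while even a harmless action in an unconnected long-tail app is confirmed and recorded because it relies on simulated input.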

A Mature AI Phone Should Be More "Restrained"

Over the past two years, the mobile phone industry has talked a great deal about AI. Many features sound exciting but have changed little for users. The Doubao phone has therefore jolted the industry and pushed the AI phone race into the deep waters of application ecosystems and operation permissions:

Mobile phone manufacturers are busy redefining the system assistant, Internet platforms are busy redrawing the boundaries of openness, chip manufacturers need to keep delivering more computing power and better energy efficiency for edge-side agents, and developers need to consider how their apps can be called, understood, and distributed by AI.

So will the second-generation Doubao AI phone develop along these lines? We still can't be sure.

However, a truly mature AI phone should be more restrained in the interaction between humans and agents and between agents and devices: In most scenarios, it should allow users to operate less, but in key scenarios, users must clearly see what the AI is doing. It can help users fill out forms, compare prices, organize itineraries, edit photos, summarize documents, and initiate communication. However, when it comes to sensitive operations such as payment, sending messages, account login, and finance, there should be clear confirmation and traceable records.
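"Clear confirmation and traceable records" amounts to a confirmation gate plus an audit log. The following minimal sketch assumes a hypothetical `SENSITIVE` set, `AuditLog` class, and `run_action` function; how a real phone would prompt the user or store records is not described in any report.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical categories that always require explicit user confirmation.
SENSITIVE = {"payment", "send_message", "account_login", "finance"}


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, action: str, outcome: str) -> None:
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "outcome": outcome,
        })


def run_action(action: str, user_confirms, log: AuditLog) -> bool:
    """Run an action, asking the user first if it is sensitive."""
    if action in SENSITIVE and not user_confirms(action):
        log.record(action, "declined")
        return False
    # Non-sensitive actions run without interruption but are still logged.
    log.record(action, "executed")
    return True


log = AuditLog()
run_action("summarize_document", lambda a: False, log)  # no prompt needed
run_action("payment", lambda a: False, log)             # user says no
run_action("payment", lambda a: True, log)              # user says yes
outcomes = [entry["outcome"] for entry in log.entries]
```

Here `user_confirms` stands in for whatever prompt the phone would show. Everyday tasks pass through untouched, while each sensitive step leaves either an "executed" or a "declined" entry the user can review later.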

On the other hand, as a previous Lei Technology article argued, an AI phone cannot treat the GUI Agent as the only answer, nor should it abandon the GUI Agent's general-purpose advantages altogether. After all, the developers of many long-tail apps lack the time and budget to adapt to Agent interaction right away.

At the same time, an AI phone cannot rely on the cloud model alone. Stronger edge-side AI is also imperative. What edge-side AI offers, including low latency, fewer interruptions, remembered preferences, and an understanding of context, is what secures the everyday experience.

If the second-generation Doubao AI phone can achieve all of this, its significance will extend beyond Doubao and Nubia.

This article is from "Lei Technology" and is published by 36Kr with authorization.