
Amazing! Zhipu has created the world's first universal mobile phone Agent! It's free for everyone, and the app can even directly control cloud computers.

QbitAI (量子位), 2025-08-20 15:31
It works across apps and gets things done while you chat.

Just now, the world's first universal mobile phone Agent has arrived!

Now, suppose you're in a meeting. You only need to say a single sentence to your phone, and it can "act on its own" to order takeout for you:

Help me find the nearest Luckin Coffee on Meituan Takeout and order a large iced Americano.

You can see that the moment the AI receives the task, it starts to execute it "swiftly."

It directly takes over your phone. There's no need to jump between various apps to get the takeout ordered.

Well, the most intuitive feeling is: it's convenient and intelligent.

So, what exactly is this Agent?

It's the world's first universal mobile phone Agent just released by Zhipu. It's like bringing the capabilities of Manus to your phone.

Here's the key point: It's free and accessible to everyone!

Maybe some of you are thinking, aren't there already Agents that can make phones act on their own?

Not quite. This time, it's really different.

Because with Zhipu's Agent, all tasks are executed in the cloud. It's like equipping your device with a cloud phone or cloud computer. Not only does it execute tasks smoothly, but it also doesn't affect your use of other apps!

More importantly, this is also the world's first mass-consumer-grade Agent. It can control not only mobile phones (both Android and iOS) but also cloud computers to do work for you!

Perhaps this is the best time for you to truly and freely experience an Agent.

What else can it do?

Let's briefly introduce how to operate AutoGLM.

After entering the app, you can see two major categories of tasks to choose from: one is "Life Assistant," and the other is "Office Assistant."

Take the Life Assistant as an example. When you tap in, you see a normal window for chatting with AI. But first, you need to tap the "Phone" button in the top-right corner:

Then, tap "Take over the phone" at the bottom, and you'll enter the "Cloud Smartphone" interface we mentioned earlier:

Here, you can operate it like a normal phone, enter the apps that the task might require, log in, and set up your own accounts.

After setting it up, let it run on its own. This time, let's give it a more complex task:

I want to buy a thermos cup for around 200 yuan. Help me compare prices on Taobao, JD.com, and Pinduoduo.

You can see that for a cumbersome task like comparing prices across platforms, AutoGLM can "swiftly" and accurately execute it across different apps on its own.

All we need to do is "initiate the task → wait for the result."

Besides these functions useful in daily life, AutoGLM is also very good at automating tasks in work and study scenarios.

Unlike current Agents that run in web pages on a PC, AutoGLM invokes a cloud computer directly from your phone to do the work!

First, switch to the "Office Assistant" mode. The interface looks like this:

You can see that above the input box, there are shortcuts to functions like "AI Video," "AI PPT," and "AI Webpage."

This time, let's start small and ask AutoGLM to generate a research report on Agents:

Help me generate a research report on Agents.

Similarly, without doing anything ourselves, we can watch AutoGLM use the "cloud computer" to collect and organize data. After a few minutes, a report of several thousand words based on nearly 100 reference sources is ready:

Furthermore, we can ask AutoGLM to turn the written result into a PPT:

Turn this report into a beautiful PPT.

It has to be said that a task that used to take us at least a day now only takes a few minutes with AutoGLM.

How does it work?

From the above tests, it's easy to see that compared with traditional chatbots that only "tell you how to do things," AutoGLM has evolved to "do things for you directly."

Most importantly, it hardly occupies any local resources.

This is the key upgrade of AutoGLM this time: each user is provided with a cloud phone and a cloud computer, similar to a backup device in the cloud (with a bunch of apps pre-installed).

With this, users can directly mobilize AutoGLM to execute various tasks without installing any apps or making additional connections. Moreover, when AutoGLM is working, it doesn't affect the user's normal use of their own device; the two don't interfere with each other.
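
To make the "initiate the task → wait for the result" flow concrete, here is a minimal sketch in Python. It is not Zhipu's actual API: the names CloudDevice and CloudAgentClient are hypothetical, and a local thread pool merely stands in for the remote cloud phone and cloud computer. The only point it illustrates is that the local device submits a task and stays free while the work happens elsewhere.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class CloudDevice:
    """Hypothetical stand-in for a per-user cloud phone or cloud computer."""
    kind: str
    log: list = field(default_factory=list)

    def run(self, task: str) -> str:
        # A real Agent would operate apps on the remote device here;
        # this sketch only records the task and returns a summary string.
        self.log.append(task)
        return f"[{self.kind}] done: {task}"


class CloudAgentClient:
    """Hypothetical client: the local device only submits tasks and reads results."""

    def __init__(self) -> None:
        # The thread pool is an assumption that simulates remote execution.
        self._pool = ThreadPoolExecutor(max_workers=2)
        self.devices = {
            "phone": CloudDevice("cloud phone"),
            "computer": CloudDevice("cloud computer"),
        }

    def submit(self, device: str, task: str) -> Future:
        # Returns immediately, so the caller is not blocked while the task runs.
        return self._pool.submit(self.devices[device].run, task)

    def close(self) -> None:
        self._pool.shutdown(wait=True)


client = CloudAgentClient()
pending = client.submit("phone", "compare thermos cup prices")
# The user's own device is free to do anything else at this point.
result = pending.result(timeout=5)  # "wait for the result"
client.close()
print(result)
```

In this sketch the caller never touches the device object while the task runs, which mirrors the article's claim that the Agent and the user's own device do not interfere with each other.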

Even better, apps that you rarely use but still need can be placed on the cloud device instead, freeing up local storage and making your own device run more smoothly.

In short, AutoGLM runs smoothly on devices like mobile phones and PCs because of its underlying design of cloud-based execution.

From a broader perspective, "cloud-based execution" not only addresses the industry's pain points precisely but also follows a rising trend.

This year, the popularity of Agents has been obvious to all, but when it comes to implementation, everyone starts to have a headache:

First, the computing power of local devices is limited. Ordinary mobile phones and computers simply cannot support Agent tasks that require high concurrency and high computing power. In other words, they can handle simple tasks occasionally, but they're likely to "crash" when faced with complex tasks.

Second, even when performing simple tasks, Agents continuously occupy the local CPU, memory, and even operation permissions during operation, seriously affecting the user's normal use of their device and resulting in a poor experience.

"Cloud-based execution" is just the right solution: it neither occupies local resources nor interferes with the user's operation of the real device.

For this reason, we can already see more and more industry players starting to invest in cloud-based Agents.

For example, among the big Internet companies, Alibaba Cloud launched Wuying AgentBay, a "Super Brain" designed specifically for intelligent agents, at a World Artificial Intelligence Conference forum. It executes various tasks in the form of a cloud computer.

In addition, cloud providers like PPIO have also launched products such as "Agent Sandbox" to provide a dedicated cloud-based operating environment for Agents.

These moves indicate that the industry has recognized the importance of cloud-based execution for Agent development and is actively investing resources in it.

Thanks to this design, the AutoGLM that Zhipu launched this time stands out from basic Agents that can only handle simple tasks, and it truly fits into ordinary people's work and life.

Everything can be AutoGLM

Meanwhile, AutoGLM is not limited to mobile phones and computers. It can also be integrated into more kinds of devices, such as smart speakers, in-car systems, and even plush toys. The idea is "everything can be AutoGLM."

To promote its wide adoption, Zhipu has also opened a mobile API application channel and launched the "AutoGLM Developer Ecosystem Co-construction Plan" as of today, bringing AutoGLM's capabilities to more developers' intelligent products through open APIs.

Obviously, Zhipu has its own rhythm and long-term considerations in rolling out AutoGLM.

From the day it was founded, the company has pursued artificial general intelligence (AGI) as its goal and later put forward the vision of "making machines think like humans."

Around this goal, Zhipu has planned an AGI roadmap from L1 to L5: from pre-trained large models, to alignment and reasoning, self-learning, self-awareness, and finally to conscious intelligence, advancing step by step.

AutoGLM is a crucial step for Zhipu towards L3 "autonomous learning intelligent agents." By bringing Agent capabilities to a wider range of ordinary users, it not only verifies the feasibility of the current technology but also accumulates experience and feedback in real-world applications to promote the model's self-learning.

This self-learning ability enables machines to break through the limitations of simply relying on historical data to acquire knowledge. They can discover new knowledge, summarize new methods in continuous interaction with users and the environment, and in turn, improve their own capabilities, forming a positive feedback loop between technology and application.

As long as this loop keeps running, it will naturally further consolidate Zhipu's leading position in the Agent field.

Moreover, there's a relatively new change this time. Similar to GPT-5, AutoGLM has also achieved a "unification of capabilities."

Relying on Zhipu's latest open-source SOTA language model GLM-4.5 and visual reasoning model GLM-4.5V, AutoGLM (a purely domestic Agent) integrates capabilities such as reasoning, non-reasoning, coding, research, Agentic, and GUI Agent into one model for the first time.

This also represents Zhipu's early understanding of AGI:

A model with comprehensive general multi-modal and thinking abilities is an important milestone towards AGI. AutoGLM is another interim result of Zhipu's exploration in pursuit of AGI.

From an industry perspective, the greater significance of AutoGLM may be that it verifies the feasibility and reliability of the "cloud-based execution" approach with a real product.

However, it has to be said that while AutoGLM provides a new solution for the industry, it also adds more heat to the already highly competitive Agent race.

At this stage of Agent development, it's no longer just about being able to complete tasks. The key is whether an Agent can upgrade from a simple executor to an "all-around player" capable of handling more complex scenarios and dealing with uncertainties more steadily.

Of course, putting aside the fierce competition among manufacturers, for ordinary users, AutoGLM is truly changing the way we interact with machines:

The large model in our hands is no longer just "able to chat." It can directly operate the system and truly help us complete tasks.

Furthermore, Zhipu has also proposed the 3A principles that should always be pursued when moving from Agents to AGI:

Around-the-clock: On standby 24/7 and continuously executing tasks. It can still run and produce results when the user is sleeping, away, or the device's screen is off.

Autonomy without interference: The Agent operates on the cloud device without occupying the user's screen or computing power.

Affinity: It goes beyond the browser dialog box and connects to various devices and services such as mobile phones, computers, watches, glasses, pins, and home appliances, covering both the digital and physical worlds.

It's foreseeable that with the continuous