Instead of using a Mac mini, he ran OpenClaw on a $25 Android phone that can turn on its flashlight and take photos.
What would you do with $25 (approximately 173 Chinese Yuan)?
Buy a takeaway meal, top up your phone plan, or order a cheap Bluetooth headset? For one AI enthusiast and developer in the United States (referred to as Ethan in this article), $25 is enough to build an agent that can act on the physical world.
He did something that sounds a bit outrageous: he ran the recently popular OpenClaw on a prepaid Android phone that costs $25-$30 at Walmart, had it receive instructions via Discord, and let it directly control the phone's hardware, turning the flashlight on and off, taking photos for recognition, reading sensors, and even attempting to place a call.
What's more, he isn't stopping at one phone: he plans to line up a whole row of them into an agent "phone cluster".
From Chatbots to "Actionable" Agents
Ethan's setup is actually not complicated. The core structure is as follows (a minimal sketch of the hardware-call layer appears after the list):
● Install Termux (a Linux-like terminal environment for Android) on the Android phone.
● Run the OpenClaw Agent in Termux.
● Call Android system capabilities through the Termux API (the Termux:API add-on and its command-line tools).
● Communicate with the Agent via Discord.
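On the hardware side, the whole stack reduces to shelling out to the termux-api command-line tools. As a rough illustration, and not Ethan's actual code, the agent's hardware tools could be thin wrappers like these; the function names are made up, and the sketch assumes the termux-api package plus the Termux:API companion app are installed:

```python
# Minimal sketch of hardware "tools" built on the termux-api CLI.
# Assumes `pkg install termux-api` and the Termux:API companion app;
# illustrative only, not OpenClaw's actual tool definitions.
import subprocess

def torch(on: bool) -> None:
    """Toggle the flashlight via the termux-torch command."""
    subprocess.run(["termux-torch", "on" if on else "off"], check=True)

def take_photo(path: str = "shot.jpg") -> str:
    """Capture a photo with camera 0 (usually the rear camera) and return the path."""
    subprocess.run(["termux-camera-photo", "-c", "0", path], check=True)
    return path
```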
In other words, this $25 phone has become an always-online "hardware execution node". For example, he can send an instruction on Discord: "Hey Claw, turn on the flashlight and then turn it off." A few seconds later, the phone's flashlight turns on and then off.
The process behind this is not mysterious: OpenClaw receives the Discord message, calls the Termux API, and the API in turn calls the Android system interface to complete the hardware operation. Actions that used to be possible only for apps or system processes are now performed by an agent driven by a language model.
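OpenClaw handles this routing itself, but the shape of such a bridge is easy to picture. Here is a hypothetical stand-in built on discord.py; BOT_TOKEN and the keyword matching are placeholders, and a real agent would let the model choose which tool to invoke rather than matching strings:

```python
# Hypothetical Discord-to-phone bridge (pip install discord.py).
# BOT_TOKEN is a placeholder; OpenClaw's real routing is model-driven.
import subprocess

import discord

intents = discord.Intents.default()
intents.message_content = True  # required to read message text
client = discord.Client(intents=intents)

@client.event
async def on_message(message: discord.Message) -> None:
    if message.author == client.user:
        return  # ignore our own messages
    text = message.content.lower()
    # Toy keyword routing; a real agent would pick the tool itself.
    if "flashlight on" in text:
        subprocess.run(["termux-torch", "on"], check=True)
        await message.channel.send("Flashlight is on.")
    elif "flashlight off" in text:
        subprocess.run(["termux-torch", "off"], check=True)
        await message.channel.send("Flashlight is off.")

client.run("BOT_TOKEN")
```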
In Ethan's view, the really interesting part is not that it can turn on a flashlight, but that the model is starting to gain physical execution capabilities.
Photo + GPT 5.2: The Visual Ability of an Entry-Level Phone
To prove that this is more than a toy-level demo, he ran a more concrete test.
He told the Agent: "Take a photo with the rear camera and then tell me what you see." Then he pointed the phone at a Raspberry Pi on the table. The phone took the photo and sent the image back via Discord; the photo was then passed to the currently configured model, GPT 5.2, for visual analysis.
In response, the model's description was: "A single-board computer, a Raspberry Pi, and the connected USB cable."
The task succeeded: the low-end Android phone handles image capture, the cloud-based large model handles visual understanding, Discord handles interaction, and the agent handles orchestration. A complete "perception-understanding-feedback" loop ran on $25 hardware.
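That capture-then-describe loop is straightforward to reproduce. A minimal sketch, assuming the OpenAI Python SDK as the vision backend; the model name below is a placeholder rather than Ethan's exact configuration, which the article only identifies as GPT 5.2:

```python
# Sketch of the capture-then-describe loop: termux-camera-photo takes the
# picture, a cloud vision model describes it. Model name and backend are
# assumptions (pip install openai; OPENAI_API_KEY read from the environment).
import base64
import subprocess

from openai import OpenAI

def describe_scene(path: str = "shot.jpg", model: str = "gpt-4o") -> str:
    subprocess.run(["termux-camera-photo", "-c", "0", path], check=True)
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    resp = OpenAI().chat.completions.create(
        model=model,  # placeholder; swap in whatever vision model you use
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what you see in this photo."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content

print(describe_scene())
```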
Not Only Sensing the Phone's Orientation but Also Making Calls
Ethan also tested the sensors. He asked: "What's the current orientation of the phone?"
The agent read the accelerometer, analyzed the direction of gravity, and replied that the phone was roughly upright. At that moment, he was indeed holding the phone vertically.
This shows that the agent is no longer just a "text understanding system" but a node capable of reading real-world physical state: hardware such as the IMU, camera, and flashlight, originally exposed only to apps, has become part of the AI's toolbox.
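The orientation check comes down to one accelerometer read and a look at which axis gravity dominates. A minimal sketch; the JSON key that termux-sensor returns varies by device, so the parsing here is an assumption:

```python
# Infer rough orientation from one accelerometer sample. On Android, the
# y axis points up along the screen, so gravity (~9.8 m/s^2) dominating y
# means the phone is upright in portrait. JSON keys vary by device.
import json
import subprocess

out = subprocess.run(
    ["termux-sensor", "-s", "accelerometer", "-n", "1"],
    check=True, capture_output=True, text=True,
).stdout
x, y, z = next(iter(json.loads(out).values()))["values"][:3]

if abs(y) >= max(abs(x), abs(z)):
    print("Roughly upright (portrait)")
elif abs(z) >= abs(x):
    print("Roughly flat on its back or front")
else:
    print("Roughly on its side (landscape)")
```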
One might naturally ask: since it can call the camera and read sensors, can it make a call?
Theoretically, it can. Ethan asked the Agent to find "Mike" in the contact list and dial the number. The phone did bring up the dialing interface and attempted to initiate the call; however, since the prepaid phone had no active number bound to it, the call naturally failed.
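termux-api also exposes contacts and telephony, so the dial attempt plausibly reduces to two calls like the ones below. This is a sketch, not Ethan's code; the JSON field names from termux-contact-list and the Android permissions granted to Termux:API are assumptions to verify on your own device:

```python
# Sketch of "find Mike in the contacts and dial him" via termux-api.
# termux-contact-list and termux-telephony-call need the Termux:API app
# and contacts/phone permissions; field names may differ on your device.
import json
import subprocess

contacts = json.loads(
    subprocess.run(["termux-contact-list"],
                   check=True, capture_output=True, text=True).stdout
)
mike = next(c for c in contacts if "mike" in c["name"].lower())
subprocess.run(["termux-telephony-call", mike["number"]], check=True)
```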
Ethan added a caveat: "If you want OpenClaw to monitor the microphone audio or send voice, the phone needs root privileges. But my phone isn't rooted, so it can't be done, because Android has very strict sandbox isolation for permissions related to calls and audio."
Future Vision: Building a "Phone Cluster"
In fact, many developers currently run agent clusters on Mac minis or small servers: the hardware is fast, deployment is stable, and the environment is controllable. Against that backdrop, Ethan's decision to run OpenClaw on low-cost phones is quite surprising.
After the demonstrations, Ethan said that although this $25 phone has a limited configuration, it already works well as an entry-level device for running OpenClaw: "For many developers who want to try OpenClaw but don't want to spend too much on hardware, these cheap prepaid phones are an excellent choice. They let you get started quickly and experience the fun of AI agents controlling hardware."
However, he also conceded that if the budget allows, a Raspberry Pi is still the better host for OpenClaw:
"The Raspberry Pi runs a native Linux system. You don't have to tinker with the OpenClaw configuration to bypass system restrictions like you do with an Android phone. It's more convenient to use and can avoid many compatibility issues."
As for future plans, Ethan revealed that his next step is to build a "phone cluster": "Many people now buy multiple Mac minis to build an OpenClaw cluster. I also want to try using several of these cheap Android phones to form a phone cluster. Each phone will run an OpenClaw agent, and then I'll interact with all the agents simultaneously via Discord to see what more interesting functions can be achieved."
Community Doubts: Is a Phone Cluster Really Useful?
After Ethan's video was posted, the comment section was quite divided.
Some people said straightforwardly: "It's cool, but I can't think of any practical use for controlling a cluster of phones." Others started to think outside the box:
● It could be turned into an extremely low-cost security system: when it detects movement in the frame, it automatically records a 15-second video and sends it to the owner by text message or email. In principle this logic is feasible: the phone already has a camera, network, and sensors, so as long as the agent can link the trigger condition to the sending logic, it can become a distributed monitoring node (a rough sketch follows this list).
● Some commenters joked that with SIM cards inserted into all the phones, they could become a "social media like farm".
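For a sense of how the security-system idea might bolt together, here is an assumption-heavy sketch: it polls still photos instead of recording video, compares frames with Pillow, and posts alerts to a Discord webhook whose URL is a placeholder:

```python
# Crude motion detector: photograph the scene every few seconds, compare
# with the previous frame, and post an alert to a Discord webhook when
# they differ. Needs Pillow and requests; WEBHOOK_URL is a placeholder.
import subprocess
import time

import requests
from PIL import Image, ImageChops, ImageStat

WEBHOOK_URL = "https://discord.com/api/webhooks/..."  # placeholder
THRESHOLD = 12  # mean per-pixel difference treated as "movement"

def snap(path: str) -> Image.Image:
    subprocess.run(["termux-camera-photo", "-c", "0", path], check=True)
    return Image.open(path).convert("L").resize((160, 120))  # small grayscale

prev = snap("a.jpg")
while True:
    time.sleep(5)
    cur = snap("b.jpg")
    diff = ImageStat.Stat(ImageChops.difference(prev, cur)).mean[0]
    if diff > THRESHOLD:
        requests.post(WEBHOOK_URL,
                      json={"content": f"Movement detected (diff={diff:.1f})"})
    prev = cur
```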
Among the many comments, there was also a more practical voice.
In the past, many people wanted to run similar experiments but were held back by model costs: calling top-tier model APIs costs subscription money, while open-source models that can run locally often need at least 40 GB of memory. For ordinary developers with only 10-20 GB available, running them smoothly is nearly impossible.
Now, pairing a cloud API with low-end hardware for data collection offers a compromise: heavy computation stays in the cloud, the large model handles only understanding, and the phone handles only perception and execution. This gives more budget-conscious developers a chance to participate.
So, what do you think of Ethan's experiment? Feel free to share your thoughts in the comments.
Reference link: https://www.reddit.com/r/AgentsOfAI/comments/1qybhk2/this_guy_installed_openclaw_on_a_25_phone_and/
This article is from the WeChat official account "CSDN", compiled by Zheng Liyuan, and published by 36Kr with authorization.