
Jensen Huang's Ambitious Vision for Physics AI: Transforming 5G Networks into Distributed AI Computers

Internet of Things Think Tank · 2026-03-20 21:33
Is the communication network being devoured by AI? From "transmitting bits" to "providing intelligence", and from "passive pipelines" to "active computing platforms", a new paradigm is taking shape.

In recent days, discussions about NVIDIA's GTC conference have been almost dominated by Jensen Huang's "token economics."

"The data center of the future is not a storage warehouse but a factory for producing intelligent tokens; and performance per watt is the only hard indicator in this race." With these words, Jensen Huang painted a new paradigm for future competition for enterprises.

From computing power cost to inference efficiency, from token price to AI business models, the market's attention is focused on a familiar question: how can we produce and consume "intelligence" more efficiently? However, if we shift our focus slightly away from the cloud, we find another piece of NVIDIA news that is easy to overlook: on March 16th, NVIDIA announced a partnership with T-Mobile and Nokia to deploy physical AI applications on a distributed edge AI network, aiming to upgrade the wireless communication network into a high-performance edge AI computing platform.

Compared with the re-optimization of efficiency and cost in "token economics," this news points to a more fundamental question: when AI is no longer just generating content but must enter the real world and participate in every real-time decision, does the network and computing architecture on which we run AI need to be rewritten?

Jensen Huang's answer to this question is straightforward: "The network is evolving into an AI infrastructure, enabling billions of devices, from visual AI agents to robots and autonomous vehicles, to see, hear, and act in real time. By collaborating with T-Mobile and Nokia to transform the 5G network into a distributed AI computer, we are creating a scalable blueprint for the global edge AI infrastructure."

For professionals who have long followed the Internet of Things and edge computing, this may be the more notable signal from this GTC.

Breaking the Key Bottleneck in the Large-Scale Development of Physical AI

Previously, Jensen Huang has laid out his predictions about the development stages of AI in multiple speeches: after the stages of perception AI and generative AI, AI has now entered the agentic AI stage, and the future will be the era of physical AI. If generative AI solves the problem of "understanding and generating information," then physical AI faces a more complex proposition: understanding the world and acting in it.

According to NVIDIA's definition, "Physical AI is a model that uses motor skills to understand and interact with the real world, usually carried by autonomous machines such as robots and autonomous vehicles." Large language models like GPT and Llama are remarkable at generating human language and abstract concepts, but they have limited knowledge of the physical world and the rules that govern it. Physical AI, by contrast, can understand the spatial relationships and physical behaviors of the three-dimensional world we live in, thus extending today's generative AI.

With physical AI, autonomous machines can perceive, understand, and perform complex operations in the real (physical) world. For example, autonomous vehicles can use sensors to perceive and understand the surrounding environment and make informed decisions everywhere from open highways to urban landscapes, including detecting pedestrians more accurately, responding to traffic or weather conditions, and changing lanes automatically. In industrial and logistics scenarios, autonomous mobile robots (AMRs) in warehouses can use direct feedback from on-board sensors to navigate complex environments and avoid obstacles, including humans, while manipulators can adjust their grip and position according to the pose of objects on a conveyor belt for precise operations. In urban spaces, systems composed of numerous cameras and sensors are trying to understand and respond to environmental changes in real time.

It is precisely in this transformation that AI's requirements for the underlying infrastructure have completely changed: once AI enters the physical world, latency, reliability, and real-time performance shift from "experience issues" to "life-and-death issues."

Many systems cannot tolerate high latency and cannot rely on the classic path of "upload to the cloud first, then process." As current industry practice shows, scenarios such as autonomous driving, robotics, and smart cities all require millisecond-level response and highly reliable connectivity. The problem becomes clear: a key bottleneck in the large-scale development of physical AI is the lack of low-latency, secure, and ubiquitous connectivity.
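As a back-of-the-envelope illustration of why the "cloud first" round trip struggles with these budgets, here is a rough latency model in Python; every constant (fiber speed, hop counts, distances, inference time) is an illustrative assumption, not a measurement:

```python
# Rough latency-budget sketch. All constants are illustrative assumptions.
# Light travels roughly 200 km per millisecond in optical fiber, so
# distance alone puts a hard floor under any cloud round trip.
SPEED_IN_FIBER_KM_PER_MS = 200

def round_trip_ms(distance_km: float, router_hops: int,
                  per_hop_ms: float = 0.5, inference_ms: float = 5.0) -> float:
    """Round-trip time: propagation both ways + per-hop delay + inference."""
    propagation = 2 * distance_km / SPEED_IN_FIBER_KM_PER_MS
    return propagation + router_hops * per_hop_ms + inference_ms

# A distant cloud region vs. an edge node near the base station.
print(f"cloud (1500 km, 12 hops): {round_trip_ms(1500, 12):.1f} ms")  # ~26 ms
print(f"edge  (  15 km,  2 hops): {round_trip_ms(15, 2):.1f} ms")     # ~6 ms
```

Even with generous assumptions, the distant path burns most of a millisecond-scale budget on transport alone, before any computation happens.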

Under the traditional architecture, there are two solutions to this problem, but neither is ideal:

"All to the cloud": That is, terminal devices collect data, upload it to the cloud for processing, and then receive the results. The problem with this model is that the link is too long, and latency and stability are uncontrollable, making it almost unusable in critical scenarios.

"All on the device side": Stacking as much computing power as possible on the device itself. However, this also faces bottlenecks. The limitations of terminal devices in terms of power consumption, cost, and volume make it impossible to support the continuous operation of complex models. At the same time, the isolation of computing power on devices also makes it difficult to support the continuous iteration and unified scheduling of models.

It is precisely between these two paths that a new architecture is emerging: "sink" computing power out of the cloud, not all the way onto the terminal, but into the network. This is the core logic of the AI-RAN architecture promoted this time by NVIDIA, T-Mobile, and Nokia: deploy AI inference capabilities at network edge nodes close to the terminals, so that physical AI systems can offload a large share of computing tasks from the device side to the nearest base stations or edge data centers.
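To make the offload pattern concrete, here is a minimal sketch of the decision a device-side runtime might make under a latency budget. This is not NVIDIA's actual AI-RAN API; the EdgeNode type, the latency figures, and choose_target are hypothetical illustrations:

```python
from dataclasses import dataclass

@dataclass
class EdgeNode:
    """Hypothetical edge inference site (base station or edge data center)."""
    name: str
    network_rtt_ms: float   # round trip between device and this node
    inference_ms: float     # model execution time on the node's GPUs

def choose_target(edge_nodes: list[EdgeNode],
                  local_inference_ms: float, budget_ms: float):
    """Prefer the fastest edge node within the latency budget; fall back
    to on-device inference only if no edge node fits."""
    viable = [n for n in edge_nodes
              if n.network_rtt_ms + n.inference_ms <= budget_ms]
    if viable:
        return min(viable, key=lambda n: n.network_rtt_ms + n.inference_ms).name
    return "on-device" if local_inference_ms <= budget_ms else None

nodes = [
    EdgeNode("base-station-42", network_rtt_ms=8.0, inference_ms=15.0),
    EdgeNode("metro-edge-dc", network_rtt_ms=18.0, inference_ms=6.0),
]
# The underpowered device misses the 30 ms budget; both edge nodes fit,
# and the nearby base station wins on total latency.
print(choose_target(nodes, local_inference_ms=120.0, budget_ms=30.0))
```

The design point is exactly the one the article makes: the heavy model lives on shared edge hardware, and the device only needs enough intelligence to route its requests.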

The direct result of this change: developers no longer need to stack expensive computing power into each camera, robot, or terminal device. Instead, they can rely on distributed computing resources on the network side to deploy more complex AI capabilities at lower cost. Under this architecture, the communication network is no longer just "transmitting data" but becomes a computing platform that carries intelligence, supporting AI applications at the scale of billions of devices.

Leading Developers Deploy Inference and Visual AI to the Edge

To transform the network into a distributed AI computing platform, it must provide ultra-low latency and spatio-temporal consistency for billions of terminals at the network edge. This is exactly the core capability of T-Mobile, NVIDIA's partner this time. Unlike Wi-Fi, with its limited coverage and security, T-Mobile's 5G standalone network provides wide-area coverage and quality-of-service guarantees, enabling complex AI agents to operate at busy urban intersections, in industrial facilities, and in remote areas.

According to the official press release, T-Mobile is collaborating with NVIDIA-certified physical AI developers (including Fogsphere, LinkerVision, Levatas, Vaidio, and Siemens Energy) to demonstrate "how base stations and mobile switching centers can support distributed edge AI workloads" while making full use of the public 5G network connection. They will integrate NVIDIA's Metropolis Blueprint for video search and summarization (VSS) on this platform.

NVIDIA's latest VSS (3) Blueprint introduces multi-modal visual understanding and intelligent search functions, delivered in a modular architecture that can be reconfigured for different environments, "from retail stores to warehouses." NVIDIA says there are 1.5 billion cameras globally, yet less than 1% of video content is ever manually reviewed. The VSS (3) Blueprint can "break down complex natural language queries and search video clips to find specific events within five seconds" and "summarize long videos 100 times faster than manual review."
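From a developer's perspective, a natural-language video query against such a service might look roughly like the sketch below. The endpoint URL, payload fields, and response shape are assumptions made for illustration; they are not the documented Metropolis VSS API:

```python
import requests

# Hypothetical client for a VSS-style video search service.
# The endpoint and JSON schema are illustrative assumptions only.
VSS_ENDPOINT = "https://edge-node.example.com/vss/v1/search"

query = {
    "query": "forklift reversing near a pedestrian in aisle 3",
    "camera_ids": ["warehouse-cam-07", "warehouse-cam-08"],
    "time_range": {"start": "2026-03-16T08:00:00Z",
                   "end": "2026-03-16T18:00:00Z"},
    "max_results": 5,
}

resp = requests.post(VSS_ENDPOINT, json=query, timeout=10)
resp.raise_for_status()
for clip in resp.json().get("clips", []):
    # Each hit is expected to carry a clip reference plus a generated summary.
    print(clip["camera_id"], clip["start"], clip["summary"])
```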

Currently, many leading developers are collaborating with NVIDIA and T-Mobile. Based on the NVIDIA Metropolis Blueprint for video search and summarization (VSS), they are integrating physical AI agents that can drive real-time actions into T-Mobile's distributed edge network. Pilot application scenarios include:

Smart city operations: LinkerVision, Inchor, and Voxelmaps are testing an integrated "urban operation agent" and computer-vision-based digital twin. The system can perceive, simulate, and optimize traffic signal timing, with the goal of making accident response in San Jose five times faster.

Automated inspection of public (power) facilities: Levatas is using NVIDIA's computing power to run 5G-based automated inspections across hundreds of thousands of miles of transmission lines, detecting and quickly handling problems such as pole tilting, corrosion, and abnormal heating up to five times faster. The two parties are currently evaluating the AI-RAN infrastructure to further reduce costs, shorten fault recovery time, and accelerate the shift from reactive to predictive maintenance.

Vision-based facility management: developers such as Vaidio are building facility management agents on the VSS Blueprint for threat detection and fault prediction, triggering automated workflows to improve facility management efficiency.

Real-time industrial safety: Fogsphere provides a safety AI agent for SAIPEM that detects and responds in real time to dangerous events in high-risk onshore, offshore, and drilling construction environments, such as workers standing under suspended loads or hydrocarbon leaks.

How Does AI Reshape the Role of the Communication Network?

From a more macroscopic perspective, the changes mentioned above also mean that the role of the telecommunications industry itself is undergoing a fundamental transformation.

For a long time, the communication network has been regarded as "connectivity infrastructure": its core task is to move data efficiently between devices. The scale of this infrastructure rivals the entire IT industry: the global telecommunications industry is worth nearly $2 trillion, and base stations blanket cities and rural areas, making it one of the most widely distributed technological systems in human society. In the past, these nodes carried information flows; under the AI-RAN architecture, nodes that were mainly responsible for "transmission" will be redefined as distributed computing nodes and become the infrastructure platform on which AI runs at the edge.

AI's reshaping of the communication network's role has actually been happening quietly. Previously, in the article "Is LoRa Competing for the 'Right to Speak' in the New Development Cycle of the Internet of Things?", I noted that it is no accident that the LPWAN camp represented by the LoRa Alliance has begun to emphasize concepts such as "physical AI" and "closed-loop action." In the past LPWAN competitive landscape, whether NB-IoT, LTE-M, or satellite IoT, the technical narrative long revolved around coverage, power consumption, and cost, and LoRaWAN was likewise known for "low power consumption, low cost, flexible private networking, and strong deployment flexibility." In the AI era, however, it is trying to redefine its role: not just a data connection protocol, but the data entry point and action exit point of AI, and the communication nervous system of physical AI.

This trend will be even more obvious in future network architectures. The design concept of 6G points toward "born for AI," rather than just higher speeds. In February 2026, the 3GPP SA2 #173 meeting concluded in Goa, India. Its R20 architecture panorama report sent an important signal: industry consensus has moved beyond the simple "connectivity pipeline" to a "native intelligent platform." Under this architecture, the core network element AIMF (AI management function) changes how the terminal and the network interact: where the core network used to be responsible only for bit transmission, the R20 architecture starts to provide MaaS (Model as a Service). Through a gradient-splitting mechanism, the terminal only needs to compute the lower-layer gradients, which protects privacy, while the core network takes on the higher-layer gradient computation. This means network computing power will directly participate in the training and optimization of user-side large models, rather than remaining a passive information-transmitting pipeline.
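The article describes the R20 gradient-splitting mechanism only at a high level, so the following is a generic split-learning sketch in PyTorch under that reading: the terminal holds the lower layers and the raw data, the network side holds the upper layers, and only the cut-layer activation and its gradient cross the boundary. The layer sizes and model are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Split-learning sketch: raw data never leaves the terminal; only the
# cut-layer activation (and its gradient) crosses to the network side.
device_part = nn.Sequential(nn.Linear(32, 64), nn.ReLU())   # on the terminal
network_part = nn.Sequential(nn.Linear(64, 10))             # in the core network

x = torch.randn(8, 32)               # private on-device data
y = torch.randint(0, 10, (8,))       # labels

# Terminal forward pass: only the intermediate activation is sent onward.
smashed = device_part(x)
smashed_remote = smashed.detach().requires_grad_(True)

# Network-side forward and backward: computes the higher-layer gradients.
logits = network_part(smashed_remote)
loss = nn.functional.cross_entropy(logits, y)
loss.backward()

# Only the cut-layer gradient returns to the terminal, which finishes
# backpropagation through its local lower layers.
smashed.backward(smashed_remote.grad)
```

Under this division of labor, the network genuinely participates in training the user-side model while never seeing the raw inputs, which is the privacy argument the R20 report makes.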

Looking at the overall picture, it is obvious that AI is engulfing the communication network, and the communication network is also reshaping itself. Whether it is edge computing, physical AI, or the future 6G native intelligent network, all point to the formation of a new paradigm: from "transmitting bits" to "providing intelligence," from a "passive pipeline" to an "active computing platform." Under this new paradigm, AI will not only be software but an inherent attribute of the telecommunications network; the network will not only be infrastructure but a real-time ecosystem that carries intelligence.

Now, we may really be at the starting point of an intelligent world where "intelligence is everywhere and can be easily accessed."

This article is from the WeChat official account "Internet of Things Think Tank" (ID: iot101). Author: Sophia. Republished by 36Kr with permission.