Is it feasible to integrate GPUs into base stations? The real controversy surrounding AI-RAN
If you've followed telecommunications industry news over the past year, the term "AI-RAN" has been hard to avoid. The AI-RAN Alliance led by NVIDIA and SoftBank, T-Mobile's laboratory tests in Seattle, the AI call demonstration completed by Indosat in Indonesia — a string of developments seems to signal that GPUs are about to enter base stations at scale, and that AI is moving from the "upper layers of the network" down to the "wireless bottom layer".
However, talk to people working at the telecom operators and you'll find their attitude is nowhere near as excited as what's on the press conference stage. There is excitement, but more of it is prudence, wait-and-see, and even a barely concealed doubt: Do base stations really need a GPU? How do the economics work out? Is AI-RAN really a technological revolution, or just another "race" set in motion by chip makers?
The "Smart" and "Not-So-Smart" in Base Stations
To understand what AI-RAN aims to do, we need to go back to the old problem of the radio access network.
There has always been an unspoken truth in the telecommunications industry: In the entire mobile network, the radio access network (RAN) is the least "smart" part. The core network has long been virtualized and cloudified, and various open-source platforms and general-purpose servers are thriving. However, when it comes to the base station level, the situation is completely different — dedicated chips, closed interfaces, and customized hardware, like a black box where outsiders can't get in and the data inside can't get out.
This may sound like a technical problem, but ultimately it's an economic one. The number of base stations deployed is extremely large. A medium-sized operator may have hundreds of thousands of sites. At this scale, any cost fluctuation will be magnified to an astonishing figure. Although dedicated chips are not very flexible, their power consumption, cost, and stability have been optimized to the extreme over decades, making them the "optimal solution" acceptable to operators.
However, problems also arise. As AI begins to penetrate every corner of the network and operators hope to optimize coverage, increase capacity, and reduce energy consumption through intelligent means, RAN has become the most difficult part to handle. You can deploy AI servers freely in the core network and equip the operation and maintenance center with a bunch of GPUs for network optimization, but when it comes to the base station, the path ends.
The ambition of AI-RAN is to bridge this last mile. It should be clarified, though, that the relationship between AI and RAN runs in both directions. The industry usually uses two terms to define it: one is "AI for RAN", using AI to optimize the performance of the radio access network — channel estimation, beam management, and load balancing are all typical scenarios where AI empowers the network; the other is "RAN for AI", turning the base station itself into a provider of AI computing power, so that the sites scattered everywhere become infrastructure for distributed inference. NVIDIA's logic is simple: instead of letting AI hover outside the base station, invite it in and make the base station itself smarter.
It sounds wonderful. But do base stations really need to be that smart?
Who Will Pay for GPUs in Base Stations?
This requires some accounting.
Let's first look at the cost. A GPU suitable for base stations is not cheap, and its power consumption is significant. Deploy it across hundreds of thousands of base stations and the capital expenditure becomes astronomical. And operators are not having an easy time these days: traffic keeps growing while revenue does not, a dilemma shared across the global telecommunications industry over the past decade. Against this background, it's not hard to imagine how difficult it is to get operators to pay for putting a GPU in every base station.
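To make "astronomical" concrete, here is a minimal back-of-envelope sketch; the fleet size, per-site GPU cost, power draw, and electricity price are purely illustrative assumptions, not vendor or operator figures.

```python
# Rough fleet-level cost sketch. Every figure below is an illustrative
# assumption, not a quoted price.
sites = 200_000               # assumed fleet size of a mid-sized operator
gpu_capex_per_site = 10_000   # assumed cost of a RAN-grade GPU + integration, USD
gpu_power_w = 300             # assumed added power draw per site, watts
electricity = 0.12            # assumed electricity price, USD per kWh

capex = sites * gpu_capex_per_site
annual_energy_cost = sites * gpu_power_w / 1000 * 24 * 365 * electricity

print(f"Fleet-wide GPU capex:       ${capex / 1e9:.1f}B")
print(f"Added energy cost per year: ${annual_energy_cost / 1e6:.0f}M")
```

Even with these deliberately modest assumptions, the fleet-wide bill lands in the billions of dollars of capex plus tens of millions per year in electricity — which is why the question of who pays comes before the question of what the GPU can do.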
Alok Shah, Vice President of Strategy and Marketing in the Network Division of Samsung Electronics America, said, "The industry is carefully evaluating the total cost of ownership model and business cases related to introducing GPUs into base stations. So far, a full transition to GPU computing at the site level faces challenges in both capital expenditure and operating costs, but innovation in this field is very active. We may find that in the next few years, the deployment of some sites will be economically viable." In other words, the economics haven't been worked out yet, and no one dares to make a hasty decision.
Then, let's look at the necessity. Do base stations really need GPUs, or is the existing CPU sufficient? This question is often overlooked. The performance of new-generation x86 CPUs has improved significantly, and many AI inference tasks — especially lightweight ones with relaxed latency requirements — can be handled by CPUs. If a CPU can solve the problem, why spend extra money on a GPU?
An expert from a domestic telecommunications operator put it more bluntly: "If computing power must be placed in the base station and it must be a GPU, I have doubts. The cost is too high, and operators simply can't accept it. Moreover, locking the computing power inside the base station actually limits its flexible scheduling. Computing power should be a dynamically adjustable resource pool that can be allocated among edge nodes, aggregation equipment rooms, and central clouds, rather than being fixed at each site." There are also different views, however: according to demonstration data, if the idle computing power of a single base station is rented out at 70% of the cloud computing market price, 30% of the base station construction cost can be recovered within five years.
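The revenue side of that argument can be sketched in a few lines. Only the 70% price ratio and the five-year horizon come from the claim above; the site cost, cloud GPU price, and hours of saleable idle capacity are assumptions chosen purely for illustration.

```python
# Back-of-envelope check of "rent idle compute at 70% of the cloud price,
# recover ~30% of site cost in five years". Only the 0.70 ratio and the
# 5-year horizon come from the claim; everything else is assumed.
site_capex = 30_000           # assumed construction cost of one site, USD
cloud_gpu_price = 2.00        # assumed cloud market price, USD per GPU-hour
price_ratio = 0.70            # rent at 70% of the cloud price
saleable_hours_per_day = 4    # assumed off-peak GPU-hours actually sold per day

annual_revenue = cloud_gpu_price * price_ratio * saleable_hours_per_day * 365
recovered = annual_revenue * 5
print(f"Revenue over 5 years:          ${recovered:,.0f}")
print(f"Share of site capex recovered: {recovered / site_capex:.0%}")
```

With these particular assumptions the result lands near the claimed 30%, but the outcome swings widely with the cloud price achieved, the hours actually sold, and what a "base station construction cost" is taken to include — which is exactly why the two camps read the same idea so differently.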
Behind these statements lies a more fundamental divergence: How should AI-RAN be deployed, through a "full-scale upgrade" or "introduction on demand"? As a chip manufacturer, NVIDIA of course hopes to spread GPUs as widely as possible. But operators need to consider which scenarios really require base station-level AI capabilities.
Which Scenarios "Absolutely Need AI"
If in the 5G era, AI was still a "bonus item", then in the 6G era, the situation may change. The telecommunications industry generally believes that the 6G network will face a fundamental challenge: The complexity of the network will reach a threshold that "exceeds human scale", and the human brain can no longer manage it in real-time. By then, AI will no longer be an option but a necessity.
This is not alarmist. Already in the 5G era, base stations carry massive MIMO antenna arrays, and the parameter configuration of beamforming has become complex enough to require algorithmic assistance. In the 6G era, with higher frequency bands, more antennas, and more complex services, managing the network through manual scripts and preset strategies will be almost impossible. In other words, the future network must be "self-intelligent" — capable of self-perception, self-decision-making, and self-optimization.
This turns the question around: It's no longer "Why should AI be placed in base stations?" but "Can base stations without AI still function properly?"
Specifically, several directions are already clear.
One is channel estimation. Wireless signals are affected by interference, fading, and blockage as they propagate through the air, so the base station needs to estimate the channel state in real-time to decide which parameters to use for data transmission. Traditional algorithms have limitations, while AI can predict channel changes more accurately by learning from historical data. A Fujitsu team reported data showing that using AI to improve channel estimation can increase uplink performance by 20%, and in some scenarios by as much as 50%.
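As a toy illustration of the idea — not the algorithm behind the Fujitsu result — the sketch below learns a linear channel estimator from "historical" pilot data and compares it with a plain least-squares baseline on a synthetic correlated channel; all dimensions and SNR values are assumptions.

```python
# Data-driven channel estimation vs. a least-squares baseline, on a toy
# correlated channel. All parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_sub = 16         # pilot subcarriers
n_taps = 4         # delay taps, so the channel is correlated across subcarriers
noise_std = 10 ** (-5 / 20)   # assumed 5 dB SNR

def sample_channels(n):
    """Frequency-domain channels that vary smoothly across subcarriers."""
    taps = (rng.standard_normal((n, n_taps)) +
            1j * rng.standard_normal((n, n_taps))) / np.sqrt(2 * n_taps)
    return np.fft.fft(taps, n=n_sub, axis=1)

def add_noise(h):
    n = (rng.standard_normal(h.shape) + 1j * rng.standard_normal(h.shape)) / np.sqrt(2)
    return h + noise_std * n

# "Historical data": pairs of noisy pilot observations and true channels.
H_train = sample_channels(5000)
Y_train = add_noise(H_train)

# Learn a linear map from observations to channels (an empirical LMMSE-style fit).
W = np.linalg.lstsq(Y_train, H_train, rcond=None)[0]

# Compare against plain least squares (use the noisy pilots as the estimate).
H_test = sample_channels(1000)
Y_test = add_noise(H_test)
print(f"LS estimator MSE:      {np.mean(np.abs(Y_test - H_test) ** 2):.3f}")
print(f"Learned estimator MSE: {np.mean(np.abs(Y_test @ W - H_test) ** 2):.3f}")
```

The learned estimator wins here simply because it has absorbed the channel's frequency correlation from past data — the same intuition, at vastly smaller scale, as the neural estimators being trialed in real RAN stacks.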
Another is beam management. Massive MIMO base stations can generate multiple narrow beams to cover users in different directions. But users move, and the beams need to follow them; if beam switching is not timely, the connection drops. AI can predict a user's movement trajectory and switch the beam in advance, giving a smoother experience.
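A deliberately simple sketch of the "predict and switch early" idea: extrapolate the user's angular trajectory from recent measurements and pre-activate the beam it is heading toward. The beam codebook, the linear-motion assumption, and the numbers are all invented for illustration.

```python
# Toy predictive beam management: extrapolate the user's motion and pre-select
# the next beam, rather than reacting only after the serving beam degrades.
N_BEAMS = 16
BEAM_WIDTH_DEG = 360 / N_BEAMS

def beam_for_angle(angle_deg: float) -> int:
    """Index of the beam whose sector covers the given azimuth angle."""
    return int((angle_deg % 360) // BEAM_WIDTH_DEG)

def predict_next_beam(angle_history: list[float], horizon: int = 1) -> int:
    """Linearly extrapolate the user's angle and return the beam to pre-activate."""
    if len(angle_history) < 2:
        return beam_for_angle(angle_history[-1])
    velocity = angle_history[-1] - angle_history[-2]   # degrees per report period
    return beam_for_angle(angle_history[-1] + horizon * velocity)

# A user moving at roughly 3 degrees per measurement report.
history = [57.0, 60.1, 63.0, 66.0]
print("serving beam:  ", beam_for_angle(history[-1]))   # still beam 2
print("predicted beam:", predict_next_beam(history))    # beam 3, activated early
```

Real systems predict from beam-level measurement reports and motion models far richer than a straight line, but the payoff is the same: the handover to the next beam starts before the current one fails.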
There is also spectrum sharing. The traditional approach is to vacate the entire frequency band when it is hit by interference. AI can do it more precisely — identify the interference source and block only the affected frequencies while keeping the rest in use. An application demonstrated by MITRE in 2025 does exactly this.
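The sketch below shows the general shape of that per-subband approach — measure power in each subband and blank only those above a threshold — against a simulated narrowband interferer. It illustrates the principle only, not the MITRE system.

```python
# Toy interference-aware subband blanking: keep the clean subbands in service
# instead of vacating the whole band. Layout and threshold are illustrative.
import numpy as np

rng = np.random.default_rng(1)
N_SUBBANDS = 20
noise_floor_dbm = -100.0

# Simulated per-subband power: thermal noise everywhere, plus a narrowband
# interferer hitting subbands 7-9.
power_dbm = noise_floor_dbm + rng.normal(0, 1.5, N_SUBBANDS)
power_dbm[7:10] += 25.0

threshold_dbm = noise_floor_dbm + 10.0   # flag anything 10 dB above the floor
blocked = power_dbm > threshold_dbm

print("blocked subbands:", np.flatnonzero(blocked).tolist())
print(f"usable share of the band: {1 - blocked.mean():.0%}")
```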
The common feature of these scenarios is that they require real-time response, local decision-making, and it's impossible to send all the data back to the center for processing. This is the significance of base station-level AI.
Base Stations "Moonlighting at Night"
If "AI for RAN" aims to make the network run more smoothly, then "RAN for AI" is exploring another possibility: Can base stations not only be costly infrastructure but also become profitable assets?
SoftBank and Nokia recently conducted an interesting experiment. They built an AI-RAN platform based on NVIDIA GPUs in Japan. During the day, the base stations' computing power is prioritized for 5G communication — handling users' voice, video, and data requests. At night, when network traffic drops sharply, the otherwise idle GPU capacity automatically switches, through SoftBank's AITRAS orchestrator, into "computing power provider" mode and runs AI inference tasks for third-party customers.
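Conceptually, the switching logic can be stated in a few lines. The sketch below is a heavily simplified illustration of the idea — reserve GPU capacity for the RAN in proportion to traffic load and release the rest to external AI workloads — and is not a description of how SoftBank's AITRAS orchestrator is actually implemented.

```python
# Simplified day/night GPU switching: the RAN gets capacity proportional to
# traffic, never below a safety floor; the remainder is offered to AI tenants.
# Illustrative only -- not the real AITRAS policy.
from dataclasses import dataclass

@dataclass
class GpuAllocation:
    ran_share: float   # fraction of GPU capacity reserved for 5G processing
    ai_share: float    # fraction offered to external AI inference workloads

def allocate(traffic_load: float, ran_floor: float = 0.2) -> GpuAllocation:
    """Reserve capacity proportional to RAN load, never below a safety floor."""
    ran = max(ran_floor, min(1.0, traffic_load))
    return GpuAllocation(ran_share=ran, ai_share=round(1.0 - ran, 2))

# Daytime peak vs. 3 a.m. trough (loads are illustrative).
print("peak:  ", allocate(traffic_load=0.85))
print("night: ", allocate(traffic_load=0.10))
```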
In other words, the same base station serves as a communication base station during the day and turns into an edge AI server at night. Pallavi Mahajan, Chief Technology and AI Officer at Nokia, commented on this, "As the global demand for AI processing accelerates, this project demonstrates how to use distributed network resources to provide scalable, efficient, and sustainable AI services."
Ryuji Wakikawa, Vice President of the Advanced Technology Research Institute at SoftBank, put it more directly, "In AI-RAN, it's very important to maximize the value of computing resources. We enhanced the AITRAS orchestrator to allocate resources to external AI workloads, thereby utilizing these resources as a new source of revenue."
This "base station moonlighting at night" model touches on a deeper transformation: Base stations are evolving from simple "cost centers" to potential "profit centers". Of course, this is just an experiment. The customer profile is not clear yet — should it be sold to Internet companies for edge inference or to industrial enterprises for machine vision? The business model is still being explored. But it at least opens up an imaginative space: If all the base stations in the city can contribute computing power at night, it will be a huge distributed computing network.
Walking on Two Legs: The Long-Term Coexistence of GPUs and CPUs
Back to the original question: Do base stations really need GPUs?
The most likely answer is: Some do, and some don't. Just like today's network devices, some use dedicated chips, some use general-purpose CPUs, and some use FPGAs, each meeting its own needs. In the future, base stations will not be of a single form but will flexibly choose the computing architecture based on scenarios and costs.
In urban hotspots and high-traffic areas, base stations may indeed need GPUs to support complex AI tasks. For remote areas and low-load sites, CPUs are sufficient, and there's no need to spend extra money. Another possibility is that GPUs are not deployed in every base station but at edge nodes to cover multiple base stations in an area, taking into account both computing power supply and cost control.
NVIDIA itself has also realized this. One of the core selling points of its AI Aerial platform is "resource sharing" — the same GPU can be dynamically allocated between RAN tasks and AI tasks, handling communication when the cell is busy and inference when it is idle, which raises utilization and spreads the cost. This speaks directly to operators' cost concerns: you don't have to buy a dedicated GPU for AI; it can share one with the RAN.
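Another way to frame the same sharing idea is preemption rather than a fixed split: best-effort AI inference jobs use the GPU only in whatever time the RAN's real-time deadlines leave over. The scheduling policy and numbers below are invented for illustration and are not NVIDIA's AI Aerial scheduler.

```python
# Toy preemptive sharing: RAN signal-processing work always runs first in each
# scheduling interval; AI inference batches fill whatever GPU time is left.
# Policy and numbers are illustrative, not the AI Aerial implementation.
from collections import deque

SLOT_MS = 10.0                       # scheduling interval

def schedule_slot(ran_demand_ms: float, ai_queue: deque) -> dict:
    """Give the RAN what it needs first, then drain AI work into the leftover time."""
    ran_time = min(ran_demand_ms, SLOT_MS)
    budget = SLOT_MS - ran_time
    ai_time = 0.0
    while ai_queue and ai_queue[0] <= budget:
        ai_time += (job := ai_queue.popleft())
        budget -= job
    return {"ran_ms": ran_time, "ai_ms": ai_time, "idle_ms": round(budget, 2)}

ai_jobs = deque([2.0, 3.0, 2.5, 4.0, 1.5])   # pending inference batches, ms each
print("busy hour:", schedule_slot(ran_demand_ms=9.0, ai_queue=ai_jobs))
print("off-peak: ", schedule_slot(ran_demand_ms=2.0, ai_queue=ai_jobs))
```

During the busy hour the AI queue simply waits; off-peak it drains into the slack — utilization goes up without the RAN ever competing for its own hardware.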
The experiment conducted by SoftBank in Japan follows this idea. They used a single system to run both 5G and third-party AI applications, proving that the two can coexist without interfering with each other. For operators, this opens a new possibility: base stations no longer have to be pure cost centers; they can also become output nodes for computing power services and a new source of revenue.
Of course, this is just an experiment. There's still a long way to go from the experiment to large-scale commercial use. How to set the standards, unify the interfaces, and design the business model are all unsolved problems. The answer from industry insiders is, "The customer profile is not clear yet. Should it be sold to Internet companies for edge inference or to industrial enterprises for machine vision? The business model is still being explored."
Conclusion
Looking back in 2026, the discussion about AI-RAN has shifted from "whether to do it" to "how to do it". NVIDIA has turned a concept into an industrial alliance in two years. From T-Mobile to SoftBank, from Nokia to Cisco, more and more big players are joining. This itself shows that the direction is correct.
However, "the right direction" doesn't mean "fast implementation". The telecommunications industry has its own rhythm. The networks of hundreds of millions of users can't be messed around with casually, and stability and reliability are always the top priorities. "Whether the AI-RAN architecture can achieve large-scale commercial use depends crucially on its performance, cost control, and operational stability. Various demonstrations at this year's MWC show that the relevant technological foundation is continuously maturing, and the integration of cloud computing, artificial intelligence, and communication infrastructure has gradually entered the stage of controllable verification and implementation from the conceptual stage."
"The entry of AI into base stations will not be a revolutionary overhaul but a gradual process of penetration and integration." This process may take five or ten years, but once completed, the network will no longer be the same as it is today.
By then, base stations will not only be signal-transmitting towers but also nodes for perceiving the world; the network will not only be a data-transmitting pipeline but also an infrastructure for carrying intelligence. And the starting point of all this is the controversial and yet-to-be-verified experiments and discussions today.
This article is from the WeChat official account "Semiconductor Industry Insights" (ID: ICViews), author: Fang Yuan, published by 36Kr with authorization.