
Is the next battleground for AI computing power already extending into space?

硅谷101 · 2026-02-09 14:24
The land-grabbing campaign for "orbital computing power" has begun.

Have you ever wondered: Could the next-generation "computing power factory" not be on Earth at all? In the past few years, AI has turned data centers into new "energy monsters." Electricity, cooling, water usage, and location selection have all become key bottlenecks restricting the evolution of AI.

So a seemingly science-fiction idea has suddenly been put on the table: moving data centers to space. Building data centers in space might sound like a slide-deck pitch designed to dazzle investors. But in reality, a land-grab for "orbital computing power" has already begun.

At the just-concluded Davos Forum, Elon Musk claimed that in the next two to three years, space will become the cheapest place to deploy AI data centers. Immediately afterwards, on February 2nd local time, SpaceX announced that it had acquired the artificial intelligence company xAI. Musk revealed that after the merger, one of the most important things for SpaceX would be to promote the deployment of space data centers.

In addition to Musk, other companies are actively planning space data centers. Blue Origin, owned by Amazon founder Jeff Bezos, quietly assembled a development team more than a year ago to build dedicated satellites for orbital AI data centers; Google recently unveiled a space data center plan called Suncatcher, aiming to send the first batch of "rack-level computing power" into orbit in 2027; and NVIDIA, through the startup Starcloud, just sent a satellite equipped with an H100 GPU into orbit and completed the first-ever training of a Nano-GPT model in space, marking the point at which space computing entered the stage of practical verification.

So, today, the question for space data centers is no longer "whether to do it," but "who can do it first." Why are technology companies willing to endure extremely high launch costs to send servers into space? How exactly should data centers be built in the vacuum of high altitude? Can AI really become cheaper and more efficient when computing power leaves the Earth's surface?

01 Why Send Data Centers "to Space"?

To understand why data centers need to go to space, we first need to see how difficult things are for them on the ground. If you ask Silicon Valley tycoons what the ultimate bottleneck for AI evolution is, they probably won't say algorithms, or talent, or even chips. Instead, they will point to two of the most basic physical limitations: electricity and cooling.

In a previous episode on "the real bill for data centers," we analyzed in detail how power supply and cooling equipment, despite together accounting for less than 10% of a data center's total construction cost, are the real choke points for data centers.

Ground data centers are essentially power-hungry monsters. The continuous power draw of a super-large AI data center has grown from dozens of megawatts (MW) to hundreds of megawatts, and is now approaching 1 gigawatt (GW). What does 1 gigawatt mean? A system running at 1 gigawatt 24 hours a day, 365 days a year, consumes approximately 8.8 terawatt-hours (TWh) of electricity per year, roughly the annual electricity consumption of a medium-sized city.
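To make that figure concrete, here is the back-of-envelope arithmetic (a sketch; the 1 GW draw is the article's round number):

```python
# Back-of-envelope: annual energy use of a 1 GW data center running 24/7.
power_gw = 1.0                      # continuous draw in gigawatts
hours_per_year = 24 * 365           # 8,760 hours
energy_gwh = power_gw * hours_per_year
energy_twh = energy_gwh / 1_000     # 1 TWh = 1,000 GWh
print(f"{energy_twh:.2f} TWh per year")  # -> 8.76 TWh, ~ the article's 8.8
```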

The problem AI brings is not only electricity consumption, but also the fact that all of this electricity eventually turns into heat. Take a high-end GPU like the H100: a single card draws close to 700 watts. When thousands of such cards form a cluster, cooling becomes a systems-engineering project more expensive than the computing itself.
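As a rough illustration of the thermal load (the 10,000-card cluster size is an assumed figure for the example):

```python
# Rough thermal load of a GPU cluster: nearly all input power ends up as heat.
watts_per_gpu = 700          # approximate H100 board power, per the article
num_gpus = 10_000            # hypothetical cluster size for illustration
heat_mw = watts_per_gpu * num_gpus / 1e6
print(f"~{heat_mw:.1f} MW of heat to reject")  # -> ~7.0 MW from GPUs alone
```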

As global demand for AI computing power grows exponentially, traditional air cooling can no longer meet the needs of high-density equipment, and liquid cooling has become a necessity. Industry data suggest that for every kilowatt-hour a large data center consumes, 1 to 2 liters of fresh water are often needed for cooling. This means a 100-megawatt AI data center may consume millions of liters of water every day. What's more troublesome is that as GPU power consumption keeps rising, gains in cooling-system efficiency have slowed markedly.
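Taking the article's figures at face value (1 to 2 liters per kWh, a 100 MW facility), the daily water draw works out as follows; this is a sketch, not an audited number:

```python
# Daily cooling-water estimate for a 100 MW data center at 1-2 L per kWh.
power_mw = 100
kwh_per_day = power_mw * 1_000 * 24           # MW -> kW, times 24 hours
low, high = kwh_per_day * 1, kwh_per_day * 2  # liters at 1 and 2 L/kWh
print(f"{low/1e6:.1f} to {high/1e6:.1f} million liters per day")  # 2.4-4.8M L
```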

However, for AI to keep developing, it still has to rely on energy consumption at massive scale. AI giants are racking their brains to obtain electricity: acquiring and renovating power plants, building their own power grids, snapping up gas turbines, and researching nuclear energy... An AI energy war has broken out on the ground.

In this context, there is a need to find a place with more abundant and stable energy and more efficient cooling. The answer is space. Beyond the atmosphere, space has prepared three gifts for humanity that the ground can never provide:

The first gift is energy. On the ground, energy is a complex systematic problem involving power generation, transmission, energy storage, peak shaving, carbon emissions, land, and other aspects. Even the most ideal new energy system cannot avoid weather changes and seasonal fluctuations.

But in low Earth orbit, the logic of solar energy is completely different: there is no atmospheric attenuation, no cloud cover, and no day-night cycle. As long as the solar panels are large enough, it is theoretically possible to obtain clean energy around the clock at almost zero cost.

Calculations show that in Earth orbit, the utilization efficiency of solar energy is 8 to 10 times that on the ground. This means energy becomes, for the first time, a "continuous variable" rather than an "intermittent resource," which is crucial for the development of AI. For AI training and inference, what matters most is not "cheap electricity" but a long-term, stable, uninterrupted power supply.
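The 8-10x figure can be sanity-checked with rough numbers (assumptions: the ~1,361 W/m² solar constant above the atmosphere, a nearly always-sunlit orbit, and a typical ground capacity factor that folds in night, weather, and panel angle):

```python
# Why orbital solar can yield ~8-10x more energy per panel area than ground solar.
solar_constant = 1361          # W/m^2 above the atmosphere
orbit_duty_cycle = 0.99        # assumed: a sun-synchronous orbit is lit almost 24/7
ground_peak = 1000             # W/m^2, standard test condition at the surface
ground_capacity_factor = 0.15  # assumed typical annual average for ground solar

orbit_avg = solar_constant * orbit_duty_cycle
ground_avg = ground_peak * ground_capacity_factor
print(f"orbit/ground energy ratio ~ {orbit_avg / ground_avg:.1f}x")  # -> ~9.0x
```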

From a broader perspective, solar power is just the tip of the iceberg of the space energy gold mine. The "solar energy" used in space today is essentially a by-product of the sun's fusion reaction. The sun itself is a natural fusion reactor that has run stably for 4.5 billion years, and the energy it releases every second far exceeds the total energy required by all of human society.

Now, in order to obtain energy, many investors are funding research into small-scale fusion reactors. Elon Musk has said this is completely unnecessary, because a free, never-extinguishing ultimate energy source already hangs above our heads.

The second gift is cooling. On the ground, huge fans and expensive liquid-cooling systems are needed, but cooling in space follows completely different physical laws.

AI computation generates a great deal of heat, while the background temperature of space is only about 3 kelvin (about -270°C). Simply by turning the radiator away from the sun, efficient passive cooling can be achieved. In a vacuum, heat does not need to be "carried away"; it can be released into deep space as radiation. We can use large radiative cooling panels to dump waste heat directly into the universe. Ethan Xu, a former energy strategy manager at Microsoft, told us this means the PUE (Power Usage Effectiveness) can be brought arbitrarily close to 1.

Ethan Xu

Former Energy Strategy Manager at Microsoft, Former Research Director at Breakthrough Energy

The temperature in space is extremely low. In a traditional data center, nearly 40% of the electricity may go to cooling rather than to powering the computation. So in space, if an environment close to absolute zero can be exploited well, the data center's waste heat can be discharged directly into deep space through radiative cooling. In that case, the data center's power usage effectiveness (PUE) can theoretically approach 1; that is, almost all of the electricity supplied to the data center goes to the computation itself rather than to cooling.
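The physics behind "dumping waste heat into deep space" is the Stefan-Boltzmann law: a surface in vacuum radiates power proportional to the fourth power of its temperature. A minimal sizing sketch (the emissivity and radiator temperature are assumed values; real spacecraft thermal design is far more involved):

```python
# Radiator area needed to reject waste heat in vacuum (Stefan-Boltzmann law).
SIGMA = 5.67e-8          # Stefan-Boltzmann constant, W/(m^2 K^4)
emissivity = 0.9         # assumed high-emissivity radiator coating
t_radiator = 300.0       # assumed radiator temperature, kelvin (~27 C)
t_space = 3.0            # deep-space background, ~3 K (negligible in practice)

def radiator_area_m2(heat_watts: float) -> float:
    """Area needed to radiate `heat_watts` into deep space."""
    flux = emissivity * SIGMA * (t_radiator**4 - t_space**4)  # W per m^2
    return heat_watts / flux

# Example: reject the waste heat of one ~700 W H100-class GPU.
print(f"{radiator_area_m2(700):.2f} m^2 per 700 W GPU")  # -> ~1.69 m^2
```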

The third gift is extremely low latency. In optical fiber, light travels about 30% slower than it does in a vacuum. Through laser links, space data centers can bypass complex terrestrial networks and submarine cables, giving computing power truly instant global reach. Once computing nodes are deployed in orbit, being "in orbit" no longer means being far from Earth; in certain network topologies, they may become relay nodes that are closer to users and faster.
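The speed difference is easy to quantify (the fiber refractive index of ~1.47 is a typical value for silica fiber; the route length is illustrative):

```python
# One-way signal time over a long route: vacuum laser link vs. optical fiber.
C_VACUUM = 299_792        # km/s, speed of light in vacuum
FIBER_INDEX = 1.47        # assumed typical refractive index of silica fiber
c_fiber = C_VACUUM / FIBER_INDEX   # ~204,000 km/s, ~32% slower than vacuum

distance_km = 10_000      # illustrative long-haul route, transpacific scale
t_fiber_ms = distance_km / c_fiber * 1_000
t_vacuum_ms = distance_km / C_VACUUM * 1_000
print(f"fiber: {t_fiber_ms:.1f} ms, vacuum: {t_vacuum_ms:.1f} ms")  # ~49 vs ~33
```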

So space offers exactly the three things AI computing currently finds scarcest: continuous energy, extreme cooling, and communication links approaching physical limits. Yet this seemingly perfect solution faces a steep price of admission in reality: how do you stuff servers heavier than pianos and more fragile than porcelain into rockets and deploy them precisely into orbit? How exactly should space data centers be built?

02 How to Build Space Data Centers? The Two Main Exploration Paths Today

Currently, global exploration has gradually converged on two mainstream paths: "in-orbit edge computing" and "orbital cloud data centers." The former addresses today's problems; the latter bets on future scale. They solve problems at different levels and represent ambitions at different stages.

Regarding these two paths, Zhejiang University and Nanyang Technological University in Singapore recently published joint research in Nature that, for the first time, systematically proposes a complete technical framework. We also interviewed Dr. Ablimit Aili, the paper's first author, to help us understand how the two routes differ and how each would be built.

2.1 In-Orbit Edge Computing

First, let's look at the "in-orbit edge computing" model. An edge data center is not a complete "cloud." Its core logic is simple: instead of transmitting all the data a satellite collects back to the ground, AI accelerators are installed directly on satellites already in operation, so the data can be analyzed, filtered, and compressed in space. It suits smaller-scale, more specialized scenarios.

Ablimit Aili

Distinguished Researcher at the Yangtze River Delta Smart Oasis Innovation Center of Zhejiang University

Edge data centers mainly involve single satellites or smaller satellite clusters, for example clusters providing remote-sensing or imaging services. When these satellites are upgraded, we add better computing power, such as AI accelerators, to strengthen their specialized computing capabilities (such as image processing). This greatly reduces the amount of data the satellites need to transmit to the ground station, which in turn greatly reduces service latency and indirectly reduces the amount of data ground data centers must process.

A representative success is the collaboration between Starcloud and NVIDIA. In November last year, Starcloud's Starcloud-1 satellite carried an NVIDIA H100-class GPU into orbit. The entire computing system weighed only 60 kilograms, about the size of a small refrigerator.

The mission of this satellite was not to "demonstrate computing power," but to directly receive data from a synthetic aperture radar (SAR) satellite cluster, complete real-time processing in orbit, and then transmit the results back to Earth.

So far, it has completed several significant tasks in space. First, it successfully ran Google's open-source model Gemma and beamed the greeting "Hi, Earthlings, how are you" back to Earth, as if from an extraterrestrial intelligence. Second, it trained Nano-GPT, the small model created by OpenAI founding member Andrej Karpathy, on the complete works of Shakespeare, enabling the model to express itself in Shakespearean English. Third, it read sensor data in real time for live intelligence analysis, for example instantly identifying the heat signature of a wildfire and promptly notifying ground personnel.
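For context, "training Nano-GPT on Shakespeare" means character-level language modeling: learning to predict the next character of text. Here is a toy illustration of that underlying task, not Starcloud's or Karpathy's actual code; it uses a simple bigram count model in plain Python, and the sample string stands in for the Shakespeare corpus:

```python
# Toy character-level language model: count bigrams, then sample new text.
# Nano-GPT performs the same next-character prediction with a small transformer.
import random
from collections import defaultdict

text = "to be, or not to be, that is the question"  # stand-in for Shakespeare
counts = defaultdict(lambda: defaultdict(int))
for a, b in zip(text, text[1:]):
    counts[a][b] += 1                    # how often does b follow a?

def sample(start: str, length: int) -> str:
    out = start
    for _ in range(length):
        nxt = counts.get(out[-1])
        if not nxt:
            break                        # no known successor; stop generating
        chars, weights = zip(*nxt.items())
        out += random.choices(chars, weights=weights)[0]
    return out

print(sample("t", 40))  # generates Shakespeare-flavored gibberish
```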

The success of Starcloud-1 also means that, for the first time, onboard computing in space has stopped being merely an "auxiliary system" and has begun to take on the computation itself. There is a very clear technical and business logic behind why "in-orbit edge computing" became the first route to succeed in building space data centers.

First, the technical difficulty of in-orbit edge computing is relatively controllable. "Controllable" does not mean that sending a GPU into space is easy; rather, it means this route extends existing technologies instead of rebuilding the system from scratch:

1. At the hardware level, this route does not invent a new computing architecture. It still uses mature data center-level AI accelerators, but repackages them to adapt to the space environment.

2. At the system level, in-orbit edge computing does not pursue complex computing power scheduling and multi-node collaboration. One satellite corresponds to one type of specific task (such as remote sensing image processing, meteorology, disaster monitoring, military reconnaissance, etc.). Therefore, it is more like a "task-specific computing power device" rather than a distributed cloud system.

Since these tasks are highly defined, the algorithms, computing scale, power consumption, and cooling can all be fully designed and verified before launch, rather than "improvised" in orbit.
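Because the task and its power, thermal, and mass envelopes are fixed in advance, that verification can literally be run on the ground. A simplified sketch of such a pre-launch budget check (all names and figures are illustrative assumptions, loosely scaled to Starcloud-1's 60 kg class, not real mission data):

```python
# Simplified pre-launch budget check for an edge-computing satellite payload.
# All figures below are illustrative assumptions, not real mission data.
from dataclasses import dataclass

@dataclass
class PayloadBudget:
    solar_power_w: float        # orbit-average power from the panels
    compute_draw_w: float       # GPU + avionics draw while processing
    radiator_capacity_w: float  # heat the radiator can reject
    mass_kg: float
    mass_limit_kg: float

    def check(self) -> list[str]:
        problems = []
        if self.compute_draw_w > self.solar_power_w:
            problems.append("power: compute draw exceeds solar supply")
        if self.compute_draw_w > self.radiator_capacity_w:
            problems.append("thermal: heat output exceeds radiator capacity")
        if self.mass_kg > self.mass_limit_kg:
            problems.append("mass: payload over launch allocation")
        return problems

budget = PayloadBudget(solar_power_w=1200, compute_draw_w=900,
                       radiator_capacity_w=1000, mass_kg=60, mass_limit_kg=65)
print(budget.check() or "all budgets close")  # -> all budgets close
```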