
Rental prices have dropped sharply by 50% within 10 months. Why is the NVIDIA H100 no longer popular? | Focus Analysis

Qiu Xiaofen, 2024-10-21 14:00
When the relationship between supply and demand becomes clearer, downstream AI manufacturers and computing power providers gradually view NVIDIA products with a more rational perspective.

Author | Qiu Xiaofen

Editor | Su Jianxun

Recently, falling rental prices for NVIDIA cards have sparked intense discussion in the AI industry. One widely circulated article on the overseas internet describes the trend with the startling phrase "The NVIDIA GPU Rental Bubble Bursts".

36Kr has learned that rental prices for NVIDIA's core products in China have indeed fluctuated significantly. The 2024 rental price trends for NVIDIA's popular chips are as follows:

The NVIDIA H100 is generally rented as an 8-card node. The market quotation for a single card fluctuated between 120,000 and 180,000 yuan per month at the beginning of the year and has now dropped to 75,000 yuan;

The consumer-grade graphics card "NVIDIA 4090" was once hyped up to 18,000 - 19,000 yuan during the "mining craze" and was in short supply. At the beginning of this year, the rental price of a single "NVIDIA 4090" was around 13,000 yuan; it is now approximately 7,000 - 8,000 yuan.

In other words, the rental prices of these two popular NVIDIA chips have both dropped by roughly 50% within 10 months, and they are no longer the highly sought-after items they were over the previous two years.
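As a quick check on the "50%" figure, the quoted prices can be turned into percentage declines. This is only a sketch using the numbers given in this article; the function and variable names are illustrative assumptions.

```python
def pct_drop(old: float, new: float) -> float:
    """Percentage decline from an old price to a new price."""
    return (old - new) / old * 100.0

# Quotes taken from the article, in yuan per month.
h100_start_low, h100_start_high, h100_now = 120_000, 180_000, 75_000
gpu4090_start, gpu4090_now_low, gpu4090_now_high = 13_000, 7_000, 8_000

# Smallest and largest decline implied by each quoted range.
h100_drop_range = (
    pct_drop(h100_start_low, h100_now),
    pct_drop(h100_start_high, h100_now),
)
gpu4090_drop_range = (
    pct_drop(gpu4090_start, gpu4090_now_high),
    pct_drop(gpu4090_start, gpu4090_now_low),
)

print(f"H100 decline: {h100_drop_range[0]:.1f}% to {h100_drop_range[1]:.1f}%")
print(f"4090 decline: {gpu4090_drop_range[0]:.1f}% to {gpu4090_drop_range[1]:.1f}%")
```

On these quotes the H100 decline falls between about 37.5% and 58.3%, and the 4090 decline between about 38.5% and 46.2%, so "50%" is a rounded headline figure rather than an exact one.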

However, many industry insiders say the situation is not as alarming as the overseas articles suggest, and that there is no need to panic. Some have calculated that, historically, the rental price of conventional computing power chips has dropped by approximately 80% within five years. The NVIDIA H100 and 4090 were both released in 2022, two years ago, so their price decline is roughly in line with this pattern.
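The "80% within five years" rule of thumb can be compared with the two-year mark. The sketch below assumes the decline happens at a constant annual rate (geometric decay). That assumption is ours, not the insiders', and the function name is illustrative.

```python
def remaining_fraction(total_drop: float, total_years: float, elapsed_years: float) -> float:
    """Fraction of the original price left after elapsed_years,
    assuming a constant annual rate that yields total_drop over total_years."""
    return (1.0 - total_drop) ** (elapsed_years / total_years)

# An 80% decline over five years, as cited in the article.
annual_drop = 1.0 - remaining_fraction(0.80, 5, 1)
two_year_drop = 1.0 - remaining_fraction(0.80, 5, 2)

print(f"Implied annual decline: {annual_drop:.1%}")
print(f"Implied decline after two years: {two_year_drop:.1%}")
```

Under that assumption the implied annual decline is about 27.5%, and the cumulative decline after two years is about 47.5%, which is close to the roughly 50% drop reported for chips released in 2022.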

Other factors are also at work. The decline in the rental prices of NVIDIA's popular chips is essentially the combined result of NVIDIA's product cycle and changes in supply and demand in the computing power market.

In the face of the new changes in the market, the domestic computing power industry is also urgently making various adjustments.

The Balance of Computing Power Supply and Demand Tilts

The decline in NVIDIA chip rental prices is related to NVIDIA's current transition period between new and old products.

An industry insider said that NVIDIA's new Blackwell-architecture product this year, the GB200, has a lower cost per unit of computing power than the H100. To cut costs, most AI companies are choosing to "wait for the new product", leaving the older products somewhat neglected.

According to Jensen Huang, the new product is in a completely different position: he says demand for the Blackwell chip is so strong that allocating it is like "walking a tightrope", and that he could "offend major customers" if he is not careful.

But highly anticipated as it is, the new product has run into an embarrassing delay.

NVIDIA's engineers attribute the problem to TSMC's adoption of a new packaging technology; TSMC, for its part, accuses NVIDIA of rushing the production process and allowing less verification time than usual. As a result, NVIDIA's new chip, which was due to launch in the third quarter of this year, has been postponed to the fourth quarter or even next year.

A chip industry insider predicted to 36Kr that once the GB200 officially launches, the decline in rental prices for NVIDIA's older chips is likely to intensify. He judges that prices are "expected not to recover within the next half year".

In addition, the sharp decline in NVIDIA rental prices is related to the current mismatch between supply and demand in the computing power market.

In China, the computing power industry has developed in the opposite order from overseas. In China, the computing power pool is built first and AI applications are developed gradually afterwards, which amounts to "holding a hammer and looking for a nail". Overseas, the computing power industry is more commercialized and prefers to find specific customers first and then build a computing center to match them.

Industry statistics show that over the past two years, a total of 13,000 intelligent computing centers of various sizes have emerged in China. As of the first half of 2024, China's computing power scale (246 EFLOPS) ranked second in the world, and intelligent computing power was growing by more than 65% year-on-year.

This construction wave was accompanied by hoarding of NVIDIA H100 chips in China. But by the time these chips had entered China through various covert channels, with Hong Kong, China and Singapore as transit points, the computing power industry found to its dismay that demand for pre-training, originally the largest consumer of computing power, had generally declined. (For details, see: "In the 'Six Tigers' of Large Models, At Least Two Are Giving Up on Large Models | Focus Analysis")

At the same time, although demand for inference and model fine-tuning has picked up since the start of 2024 and is on track to surpass pre-training, it has not reached the "explosive" level originally envisioned. "At present, no super applications of AI or clear scenarios have been seen".

When the computing power created by two years of large-scale chip hoarding cannot be absorbed by widespread AI applications in the short term, the balance of supply and demand in the computing power industry tilts, and a price decline is to be expected.

From Buying Cards to Renting Cards

In the past, a common business model in the computing power industry was selling NVIDIA "bare metal", commonly known in the industry as "selling iron". But as supply and demand shift, a pure hardware-selling model is too crude to be sustainable. And as NVIDIA rental prices have "crashed" this year, downstream AI companies have quietly changed how they think about computing power chips.

In the past, whoever could buy more NVIDIA chips had the best chance of training a more powerful model faster. Now, AI companies prefer to rent chips for computing power rather than buy them outright as heavy assets that tie up cash flow.

In response, the upstream computing power industry has adjusted and is trying to launch more diverse rental services.

An industry insider said that in the past, AI manufacturers renting NVIDIA cards generally needed multiple nodes and rented by the year. This year, customers with computing power needs have not only become more dispersed but also extremely cost-sensitive, and demand for time-sharing rental has become very high.

"Now, some computing power centers also allow you to rent only a few NVIDIA cards at a time for only a few hours". It is a bit like having once had to rent an entire floor or two by the year, and now being allowed to rent a single room for a short period.

However, the direct consequence of this change is that the payback period in the computing power industry has become longer. Some industry insiders gave 36Kr a rough calculation: "for a computing power center built with H100 chips, the hardware payback period is more than 5 years".

At the same time, practitioners in the computing power industry are trying to offer finer-grained computing power services and are gradually extending upward into the model layer and the application layer.

36Kr has learned that some intelligent computing center operators, in addition to selling computing power, also help downstream AI customers fine-tune their models as a side service, or move directly into industries with stronger computing power demand, such as finance, medicine, and new energy, working with specific scenarios to uncover more potential demand for selling or renting computing power.

The aforementioned industry insider said that, by their calculations, once various AI services are added, "the hardware cost payback period can be shortened to about 2 years at the shortest".
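To see what moving from a payback period of more than 5 years to about 2 years implies, here is a minimal sketch assuming a simple undiscounted payback model (hardware cost divided by annual net income). The article gives no cost or income figures, so the hardware cost is normalized to 1.0 and all names are hypothetical.

```python
def payback_years(hardware_cost: float, annual_net_income: float) -> float:
    """Simple undiscounted payback period in years."""
    return hardware_cost / annual_net_income

def required_annual_income(hardware_cost: float, target_years: float) -> float:
    """Annual net income needed to recover hardware_cost in target_years."""
    return hardware_cost / target_years

HARDWARE_COST = 1.0  # normalized; hypothetical, not a figure from the article

income_rental_only = required_annual_income(HARDWARE_COST, 5.0)    # 5-year payback
income_with_services = required_annual_income(HARDWARE_COST, 2.0)  # 2-year payback

income_multiple = income_with_services / income_rental_only
print(f"Annual net income must rise by a factor of {income_multiple:.1f}")
```

Under this model, cutting the payback period from 5 years to 2 years requires annual net income per unit of hardware cost to rise by a factor of 2.5. Since the quoted baseline is "more than 5 years", the required increase would be at least that large.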

These adjustments are not a bad thing. After two years of rapid growth in the AI and computing power industries, the supply and demand relationship has become clearer, and both sides are now taking a more rational view of the NVIDIA chips they treated as treasures over the past two years.
