
This time, Jensen Huang didn't talk about chips, but about where the money in AI is flowing.

AI深度研究员 (AI Deep Researcher) · 2026-03-17 08:54
Jensen Huang, CEO of NVIDIA, used his GTC 2026 keynote to argue that the AI industry has reached a turning point: the money is shifting from model training to inference applications. Future data centers, he says, will evolve into factories that produce Tokens, with a business logic driven directly by computational efficiency and output value. As inference demand grows exponentially, Tokens have become a commodity with tiered pricing, deeply woven into corporate budgets and productivity tools. This shift has spawned a complete industrial chain, from cloud service providers to Agentic as a Service (GaaS), turning AI into a "digital employee" capable of performing complex tasks. Finance, manufacturing, automotive, and robotics firms are already cutting costs and raising efficiency by consuming Tokens in volume, pushing the AI chip market toward a trillion-dollar scale. The article that follows traces how AI computing power is converted into real economic output, marking the start of generative AI's commercial closed loop.

March 16, 2026, San Jose, USA, GTC 2026 Conference.

Jensen Huang listed the following set of changes:

  • The number of inference service providers has increased by 100 times in the past year;
  • Cloud service providers account for 60% of NVIDIA's revenue;
  • SaaS companies are transforming into GaaS companies, shifting from selling tools to leasing agents;
  • Companies are starting to allocate Token budgets for engineers, equivalent to half of the basic salary;
  • Industries such as finance, manufacturing, automotive, and robotics are all using Tokens to reduce costs and improve efficiency.

His prediction: by 2027, the AI chip market will be worth at least $1 trillion. But that $1 trillion does not come from selling chips alone; it comes from the entire industrial chain.

The core of this industrial chain is the Token.

Future data centers will be factories that produce Tokens. This is not just a technical concept but a complete business model.

Section 1 | Why the Money Starts to Flow

The AI industry is undergoing a crucial turning point.

In the past two years, all the money was poured into one direction: training larger models. But now, the money is flowing towards another end: making the models actually work.

This turning point was not driven by a single technological upgrade. It is the result of three nodes detonating in succession.

The first node is ChatGPT.

After its launch at the end of 2022, AI for the first time changed from "being able to understand" to "being able to generate". It can not only recognize pictures and translate texts but also write articles, write code, and generate content from scratch.

This opened a door, but it was not enough.

The second node is the o1 and o3 models.

They brought inference capabilities. AI began to be able to think, plan, and break down complex problems into steps it can understand. Jensen Huang mentioned in his speech that inference makes generative AI trustworthy and rooted in truth.

This made AI change from "being able to do" to "being able to do correctly".

The third node is Claude Code.

This is the first real agent model. It can read files, write code, compile, test, evaluate, and then return for further iteration. For the first time, you no longer ask AI "what, where, and how" but directly tell it to create, execute, and develop.

AI has changed from a tool to an employee.

After these three steps, the computing demand exploded.

Jensen Huang presented a set of data in his speech:

To think, AI now consumes 10,000 times more Tokens.

Usage has increased 100 times.

Total computing demand has increased 1 million times.

Just look at the market.

OpenAI and Anthropic are now completely limited by computing power. The amount of computing power they can obtain determines the number of Tokens they can generate and how much their revenue can increase.

In the past two years, the financing scale of AI-native enterprises has also been unprecedented. Total investment has reached $150 billion, and single rounds have jumped from millions or tens of millions of dollars to hundreds of millions or even billions. These companies are either creating Tokens or adding value to existing Tokens, and Tokens require massive computing power.

When Jensen Huang said last year that the market would reach $500 billion in 2026, no one at the scene thought it was an exaggeration. Because everyone had experienced a record-breaking year.

This year, he directly raised the figure to $1 trillion.

Because AI can finally do real work.

Section 2 | Economics of the Token Factory

"Future data centers will be factories that produce Tokens."

Behind this statement is a complete new business model.

In the past, data centers mainly did two things: store data and run software. When building one, the core question was capacity, that is, how much data it could hold.

Now it's different.

The logic of an AI factory is: the number of Tokens you can produce determines how much revenue you can generate.

Jensen Huang did the math. For a 1-gigawatt data center, the power is fixed; you cannot turn it into 2 gigawatts. That is a physical limitation: land, power, and buildings all have upper limits. So under that power cap, the number of Tokens your factory can produce directly determines how much money you can earn.

This is why he proposed a key indicator: Tokens per watt.

It sounds very technical, but put another way it is easy to understand: for the same electricity bill, how much product can your factory turn out?
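As a rough sketch of this metric, the snippet below computes Tokens per watt and the revenue a fixed-power factory would generate. All numbers here are illustrative assumptions, not NVIDIA's published figures.

```python
# Illustrative sketch of the "Tokens per watt" idea: under a fixed power
# budget, efficiency alone determines output and therefore revenue.
# All figures below are hypothetical.

def tokens_per_watt(tokens_per_second: float, power_watts: float) -> float:
    """Factory efficiency: Tokens produced per second, per watt of power."""
    return tokens_per_second / power_watts

def daily_revenue(tokens_per_second: float, price_per_million: float) -> float:
    """Dollars earned per day at a steady generation rate and Token price."""
    tokens_per_day = tokens_per_second * 86_400  # seconds in a day
    return tokens_per_day / 1_000_000 * price_per_million

POWER_WATTS = 1e9          # a 1-gigawatt facility: the fixed cost ceiling
rate_old = 200_000         # Tokens/second on an older architecture
rate_new = 5 * rate_old    # a hypothetical 5x architectural gain

# Same power bill, 5x the efficiency, therefore 5x the revenue:
print(tokens_per_watt(rate_new, POWER_WATTS)
      / tokens_per_watt(rate_old, POWER_WATTS))   # 5.0
print(daily_revenue(rate_old, 4.5))               # 77760.0 dollars/day
```

The point of the model is that with power fixed in the denominator, any gain in Tokens per second flows straight through to revenue.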

And Tokens are now starting to have tiered pricing.

The free-tier Tokens may not make money, but they can attract users.

Mid-tier Tokens cost $3 to $6 per million.

High-tier Tokens, with larger models, faster speed, and longer context, can cost $45 per million.

The top research-grade service model can charge $150 per million Tokens.

Jensen Huang gave an example in his speech. A research team uses 50 million Tokens every day. At $150 per million, that's $7,500 a day. It sounds like a lot, but for a research team, this amount of money is not a problem at all.

Because the value brought by Tokens far exceeds the cost.
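The arithmetic behind the research-team example is simple. A minimal sketch, using the tier prices quoted above; the mid-tier figure is an assumed midpoint of the $3-$6 range:

```python
# Daily Token spend at the tier prices quoted above.
# The "mid" price is an assumed midpoint of the $3-$6 range.
TIER_PRICE_PER_MILLION = {
    "free": 0.0,
    "mid": 4.5,        # assumption: midpoint of $3-$6
    "high": 45.0,
    "research": 150.0,
}

def daily_token_cost(tokens_per_day: int, tier: str) -> float:
    """Dollars spent per day for a given consumption and pricing tier."""
    return tokens_per_day / 1_000_000 * TIER_PRICE_PER_MILLION[tier]

# The research-team example from the speech: 50 million Tokens a day.
print(daily_token_cost(50_000_000, "research"))  # 7500.0
```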

More interestingly, Token budgets have started to be included in the daily operations of enterprises.

Jensen Huang mentioned that when recruiting in Silicon Valley now, "how much Token quota this job comes with" has to be written in the offer.

He said that in the future, every engineer will need an annual Token budget. The basic salary may be hundreds of thousands of dollars, but the company will also grant a Token quota equivalent to half of the basic salary, enabling a 10-fold increase in productivity.

Because every engineer who has access to Tokens will become more productive.
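To make the "half of the basic salary" quota concrete, here is a hypothetical back-of-the-envelope conversion. Both the salary figure and the $5-per-million price are assumptions for illustration only.

```python
# Hypothetical illustration of a Token budget worth half the basic salary.
# The salary and the $5-per-million price are assumed, not from the speech.

def annual_token_quota(basic_salary: float, price_per_million: float) -> float:
    """Tokens per year that a budget of half the basic salary can buy."""
    budget = basic_salary / 2
    return budget / price_per_million * 1_000_000

# A $300,000 engineer with mid-range Tokens at an assumed $5 per million:
print(annual_token_quota(300_000, 5.0))  # 30000000000.0, i.e. 30 billion Tokens/year
```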

This is the business model of an AI factory:

Power is the cost ceiling, Tokens are the source of revenue, and the architecture determines how much product can be produced per watt.

Take the Blackwell architecture as an example: compared with the previous-generation Hopper, at the same power, revenue increases 5 times. Swap in the latest Vera Rubin architecture and revenue increases another 5 times.

In two years, for the same 1-gigawatt factory, Token output has gone from 200,000 per second to 700 million per second.

That is an increase of 3,500 times.

This is not just a change in technical parameters; it is a change in the money-making logic.

In the future, every cloud service provider, every AI company, and every enterprise will focus on the same indicator: the efficiency of my Token factory.

This directly determines how much money you can earn.

Section 3 | Who Is Making This Money

When Tokens become a commodity, a new value chain emerges.

The flow of money becomes very clear: Enterprises buy Tokens → Inference service providers or GaaS companies → Cloud service providers → NVIDIA.

And on this chain, some people have already started to make money.

The first group is inference service providers

Inference service providers, companies like Fireworks, Together AI, and Lin, have multiplied 100-fold in number over the past year.

What they do is very simple: build Token factories and then sell Tokens to enterprises.

Jensen Huang showed a set of data in his speech. Before these inference service providers received NVIDIA's software updates, their Token generation speed was about 700 per second.

After the update, it directly increased to 5,000 per second.

It increased by 7 times.

With the same hardware and the same power, the output directly increased by 7 times. For them, this means a 7-fold increase in revenue.

This is why Jensen Huang said that the efficiency and performance of these Token factories are everything to them.

The second group is cloud service providers

Jensen Huang revealed a piece of data: 60% of NVIDIA's revenue comes from the top five hyperscalers, which are the largest cloud service providers.

However, this 60% also includes their own internal consumption. For example, recommendation systems and search engines are now shifting from traditional algorithms to deep learning and large language models.

More importantly, NVIDIA's cooperation model with cloud service providers is very special.

NVIDIA not only sells hardware but also integrates the entire software library and brings customers to the cloud. For all those AI-native enterprises and all companies that need Tokens, NVIDIA helps them find cloud service providers.

In turn, cloud service providers are also actively asking NVIDIA to land the next customer on their cloud.

Because of this $1-trillion investment and these Token factories under construction, they need customers to consume computing power.

The third group is SaaS companies

Jensen Huang said in his speech: Every SaaS company will become a GaaS company.

GaaS is Agentic as a Service.

In the past, SaaS companies sold tools. You paid, and they gave you a set of software for your employees to use.

In the future, GaaS companies will sell agents. You pay, and they give you a group of AI assistants to directly help you with your work.

The key to this transformation is Open Claw.

Open Claw is an open-source project that has exploded in popularity in recent weeks. Jensen Huang called it the most popular open-source project ever, surpassing in a few weeks what Linux accumulated over 30 years.

Its essence is an operating system for agents. Just as Windows made the personal computer possible, Open Claw makes the personal agent possible.

However, enterprise-level deployment has a major problem: agents access sensitive information, execute code, and communicate externally. Combined, those three things mean an agent could send the company's financial data and supply-chain information out.

So, NVIDIA cooperated with Open Claw to develop the NeMo-Claw enterprise-level reference design. It includes a policy engine, network guards, and a privacy router to enable agents to run securely within the enterprise.

Jensen Huang said that every company now needs an Open Claw strategy.

Because enterprise IT, originally a $2-trillion industry, will now grow into one worth multiple trillions of dollars: not only providing tools but also leasing agents.

The fourth group is enterprises themselves

Enterprises now have two roles: they are both buyers of Tokens and start to become producers of Tokens.

As buyers, enterprises are allocating Token budgets for employees to increase engineers' productivity by 10 times.

As producers, some enterprises are starting to build their own AI factories. They not only use AI but also find that they can directly reduce costs and improve efficiency.

Now this value chain is up and running.

Inference service providers are expanding production capacity, cloud service providers are building factories, SaaS companies are transforming into GaaS, and enterprises are purchasing Tokens.

The entire closed - loop from producing Tokens to consuming Tokens has been established.

Section 4 | Which Industries Are the Money Flowing To

When the Token factories start to operate, the most direct question is: Who is using these Tokens?

Jensen Huang mentioned a detail in his speech. At this GTC Conference, the industry with the largest proportion of participants was financial services.

He said, "I hope the participants are developers, not traders."

But there is a reason why so many traders came.

The financial industry is experiencing its deep-learning moment.

In the past, algorithmic trading meant quantitative analysts doing heavy manual feature engineering, then running classical machine-learning models. Now it is different: AI supercomputers can study massive data automatically, discover patterns, and find trading signals.

This is a paradigm shift, just like when deep learning changed computer vision.

And the characteristic of the financial industry is that speed is money.

Every millisecond of delay can mean a difference of millions of dollars. That is why they are willing to pay for high-speed Tokens and use top-tier inference services.

The second industry that has seen results is manufacturing

Jensen Huang showed two cases.

Nestle makes thousands of supply-chain decisions every day. Their order-to-cash data mart aggregates information on every supply, order, and delivery in 185 countries.

With traditional CPUs, Nestle could only refresh the data a few times a day.

After switching to NVIDIA GPU acceleration, the speed increased by 5 times, and the cost was reduced by 83%.

Another example is Snapchat. They used NVIDIA to accelerate Google BigQuery, and the cost was directly reduced by 80%.

These two figures are very straightforward: a 5-fold speed increase and cost reductions of over 80%.

For manufacturing and Internet companies, this is not just a concept but an immediate and visible benefit.

The third industry is automotive

Jensen Huang announced four new RoboTaxi Ready partners in his speech: BYD, Hyundai, Nissan, and Geely.

Together, these four companies produce 18 million cars per year.

Plus the previously partnered Mercedes, Toyota, and General Motors, the number of cars supporting RoboTaxi will be very large.

More importantly, NVIDIA has also reached a cooperation with Uber to deploy and connect these RoboTaxi Ready cars to the Uber network in multiple cities.

Jensen Huang said that the ChatGPT moment for autonomous driving has arrived: we now know that cars can successfully drive themselves.

Every autonomous car is essentially a mobile Token factory. It needs to continuously recognize roads, judge distances, plan routes, and handle emergencies, and every action requires real - time Token generation.

The fourth industry is robotics

There were 110 robots at the GTC site. Jensen Huang said that almost every robot-manufacturing company in the world