In his year-opening keynote, Jensen Huang made a splash with an unusually high "China quotient," directly using DeepSeek and Kimi to put his next-generation chips to the test.
On the giant CES screen, Jensen Huang's slides read like a hall of fame for Chinese AI. With DeepSeek and Kimi front and center, a new era of computing power has arrived.
At the highly anticipated CES 2026 technology showcase, a single slide instantly set the AI community on fire.
During Jensen Huang's keynote, the Chinese large models Kimi K2, DeepSeek V3.2, and Qwen appeared on screen, ranked among the world's top open-source large models, with performance approaching that of closed-source models.
It was a moment of glory for Chinese AI.
OpenAI's GPT-OSS and NVIDIA's own Nemotron were also marked on the slide.
Moreover, DeepSeek-R1, Qwen3, and Kimi K2 represent top-scale efforts on the MoE route: only a small fraction of parameters is activated per token, sharply reducing the compute load and the pressure on HBM memory bandwidth.
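How sparse activation saves compute can be seen in a minimal top-k routing sketch. Everything below is illustrative: the layer sizes, the single-matrix "experts," and the router are toy assumptions, not the actual DeepSeek or Kimi architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts, top_k = 64, 8, 2           # hypothetical sizes
# Each "expert" is simplified to one weight matrix.
experts = [rng.standard_normal((d_model, d_model)) * 0.02
           for _ in range(n_experts)]
router_w = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_layer(x):
    """Route token vector x to its top-k experts and mix their outputs."""
    logits = x @ router_w                      # router scores, one per expert
    top = np.argsort(logits)[-top_k:]          # indices of the k best experts
    gates = np.exp(logits[top])
    gates /= gates.sum()                       # softmax over selected experts only
    # Only top_k of n_experts expert matrices are touched for this token:
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

x = rng.standard_normal(d_model)
y = moe_layer(x)
active = top_k / n_experts                     # fraction of expert params used per token
print(y.shape, active)                         # (64,) 0.25
```

With 2 of 8 experts active, only a quarter of the expert weights are read per token, which is exactly where the compute and HBM-bandwidth savings come from.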
At the centerpiece debut of the next-generation Rubin architecture, Jensen Huang again chose DeepSeek and Kimi K2 Thinking to showcase performance.
With Rubin's boost, the inference throughput of Kimi K2 Thinking soared tenfold, and, even more strikingly, token cost plummeted to one-tenth of what it was.
This order-of-magnitude cut in cost effectively announces that AI inference is about to enter a truly "affordable era."
On the slide about soaring compute demand, the 480B-parameter Qwen3 and the 1T-parameter Kimi K2 appeared as representative models, evidence that parameter scale is growing by an order of magnitude every year.
It has to be said: the share of Chinese AI models across Jensen Huang's entire keynote was off the charts.
Tenfold Inference Surge: Have Chinese Models Become Jensen Huang's "Favorites"?
Coincidentally, an NVIDIA blog post last December also used DeepSeek R1 and Kimi K2 Thinking as benchmarks for evaluating performance.
Actual tests show that the performance of Kimi K2 Thinking on the GB200 NVL72 can surge tenfold.
In the SemiAnalysis InferenceMax test, DeepSeek-R1's cost per million tokens dropped more than tenfold, and models including Mistral Large 3 also saw roughly tenfold acceleration.
This means that deploying complex "thinking-type" MoE models in everyday applications has become a reality.
Pick any cutting-edge model today and look at its internal structure, and you will find that MoE (Mixture of Experts) has become the mainstream choice.
Statistics show that since 2025, more than 60% of open-source models have adopted the MoE architecture, and since early 2023 this architecture has helped raise the intelligence level of LLMs by nearly 70 times.
On the leaderboard of Artificial Analysis (AA), the ten most intelligent open-source models all use the MoE structure.
MoE models of this scale cannot be deployed on a single GPU, but NVIDIA's GB200 NVL72 solves exactly this problem.
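A back-of-envelope calculation shows why: at roughly 1T parameters, even the 4-bit weights alone overflow a single GPU's memory. The per-GPU HBM figure below is an assumed round number for illustration, not an official spec.

```python
# Why a ~1T-parameter MoE cannot fit on one GPU: weight memory alone
# exceeds a single accelerator's HBM, before KV cache and activations.
params = 1.0e12                  # ~1T total parameters (Kimi K2 scale)
bytes_per_param = 0.5            # 4-bit weights = 0.5 bytes each
weights_gb = params * bytes_per_param / 1e9

hbm_per_gpu_gb = 192             # assumed HBM capacity of one modern GPU
gpus_needed = -(-weights_gb // hbm_per_gpu_gb)   # ceiling division
print(weights_gb, gpus_needed)   # 500.0 3.0
```

Even under these generous assumptions the weights must be sharded across several GPUs, which is why a tightly coupled multi-GPU domain like the NVL72 matters for serving such models.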
The real-world results of DeepSeek R1 and Kimi K2 Thinking are precisely what demonstrate the power of NVIDIA's Blackwell systems.
Today, Chinese large models are shining on the global stage, and their performance has opened a new era of high-efficiency AI inference.
Open-Source AI Leaders Stun the World
At the end of last year, Anthropic released a rigorous behavioral benchmark covering 16 of the world's cutting-edge models.
Among these top models, DeepSeek and Kimi were not only the only two Chinese entrants, they also delivered striking results:
With an extremely low rate of being misled, Kimi K2 Thinking earned the title of "best-performing non-US model."
Note: the lower the score, the stronger the performance and the harder the model is to mislead.
This technological strength has quickly translated into international influence and real - world applications.
From public praise by Marc Andreessen, "the godfather of Silicon Valley venture capital," to last month's official announcement that the former CTO of OpenAI has connected their new product Thinker to Kimi K2 Thinking, the hard power of Chinese AI is being accepted by the global inner circle.
Authoritative evaluations further confirm this trend.
In the "2025 Annual Review of Open-Source Models" jointly released by well-known AI researchers Nathan Lambert and Florian Brand, DeepSeek, Qwen, and Kimi firmly occupy the top three positions.
Lambert then followed up with an in-depth article praising the unique advantages of Chinese open-source AI.
1. The "Speed Advantage" of Open-Source Models
Although the most powerful closed-source models still lead their open-source counterparts, Chinese labs are releasing models at a remarkable pace, significantly narrowing the gap.
In today's era of rapid technological iteration, "releasing earlier" is itself a huge first-mover advantage.
2. From "Ranking High" to "User-Friendly"
Chinese models keep performing better on benchmarks, but the more crucial shift is from "high scores" to "genuinely useful."
We have watched Qwen evolve: initially known for topping leaderboards, it has become a truly high-quality model.
Following this logic, K2 Thinking natively uses 4-bit precision in the post-training stage, evidently to support long-sequence RL scaling more efficiently and to better fit real serving workloads.
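To see what 4-bit weights buy, here is a toy symmetric int4 quantize/dequantize round trip: each weight shrinks to half a byte plus one shared scale, a 4x saving over fp16. This is a generic per-tensor scheme for illustration only, not Kimi K2's actual quantization recipe.

```python
import numpy as np

def quantize_int4(w):
    """Map float weights to integers in [-8, 7] plus one fp scale."""
    scale = np.abs(w).max() / 7.0              # symmetric range, stay in -8..7
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q, scale):
    """Recover approximate float weights from int4 codes and the scale."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32) * 0.1
q, scale = quantize_int4(w)
w_hat = dequantize_int4(q, scale)
err = np.abs(w - w_hat).max()                  # rounding error, at most scale/2
print(q.min() >= -8 and q.max() <= 7, err < scale)   # True True
```

The reconstruction error stays within half a quantization step, which is why low-bit weights can preserve quality while quartering memory traffic during serving.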
3. The Rise of Chinese Brands
At the beginning of the year, foreign users could hardly name a single Chinese AI lab; now DeepSeek, Qwen, and Kimi stand as representatives of Eastern technological strength.
Each has had its moments of glory and its own strengths. More importantly, the list keeps growing, and Chinese AI is taking its place on the world stage.
4. Breakthroughs: Massive Tool Calls and Interleaved Thinking
Kimi K2 Thinking's support for "stable tool calls across hundreds of steps" has sparked heated discussion.
Although this is already standard in closed-source models such as o3 and Grok 4 (emerging naturally from RL training), it is among the first times the feat has been achieved by an open-source model, which places extremely high demands on hosting providers' ability to support it precisely.
There is also "interleaved thinking": the model reasons in the intervals between tool calls.
This is a new pattern that agent-focused models are adopting in Claude's wake, and it marks a further maturing of the model's reasoning chain.
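The control flow behind interleaved thinking can be sketched with stubs: a policy that emits a reasoning step between tool calls instead of planning everything up front. The model and tool here are hypothetical stand-ins; a real agent would call an LLM API and real tools.

```python
def fake_model(history):
    """Stub policy: think, then request a tool, until enough results arrive."""
    n_results = sum(1 for kind, _ in history if kind == "result")
    if n_results >= 2:
        return ("answer", f"done after {n_results} tool calls")
    if history and history[-1][0] == "think":
        return ("tool", f"search(query_{n_results})")
    return ("think", f"need more data (have {n_results} results)")

def fake_tool(call):
    """Stub tool: echoes the call instead of doing real work."""
    return f"result-for-{call}"

def run_agent(max_steps=10):
    """Loop the model, interleaving thinking steps with tool calls."""
    history, trace = [], []
    for _ in range(max_steps):
        kind, content = fake_model(history)
        trace.append(kind)
        if kind == "answer":
            return content, trace
        history.append((kind, content))
        if kind == "tool":
            history.append(("result", fake_tool(content)))
    return None, trace

answer, trace = run_agent()
print(trace)   # ['think', 'tool', 'think', 'tool', 'answer']
```

The trace shows the shape the article describes: a reasoning step before each tool call, so the model can revise its plan after every result rather than committing to a fixed sequence.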
5. Pressuring US Closed-Source Giants
The surge of open-source models has put enormous pressure on US closed-source labs: benchmark scores alone can no longer explain "why the paid models are better."
By contrast, Chinese models may not yet have an advantage in revenue, but they are carving out an ever-larger slice of global "mindshare."
Looking back at the CES 2026 keynote, Jensen Huang made "open source" the most hardcore theme of the entire event.
The performance of Chinese open-source AI has already impressed the world. As more developers and enterprises embrace these models, a full-scale explosion of AI applications is just around the corner.
References:
https://blogs.nvidia.com/blog/mixture-of-experts-frontier-models/
https://www.interconnects.ai/p/kimi-k2-thinking-what-it-means
This article is from the WeChat public account "New Intelligence Yuan", author: Haokun Taozi. It is published by 36Kr with authorization.