
Let's talk about the business logic of tokens going global: if China opens its models to the world, what does it gain?

Liu Fei (刘飞) · 2026-03-26 15:49
After tokens become as essential as water, electricity, and gas.

01

There was an interesting piece of AI news this weekend. Let me share it with you.

On March 19th, the AI programming tool Cursor released a new model, Composer 2, which its official website describes as a "proprietary model."

Cursor is currently the most popular AI programming tool in the world. It is essentially a modified version of VS Code with deeply integrated AI capabilities (ByteDance's TRAE is a similar domestic product). Ever since Composer 1 was released in October 2024, outsiders have suspected that its model was a repackaged one, but no evidence had surfaced.

Now the evidence is here. Less than 24 hours after the release, a developer, @fynnso, had a clever idea: he set up his own server as a model endpoint and pointed the model address in his local Cursor at it. The requests Cursor sent out were thereby exposed, and the model ID read kimi-k2p5-rl-0317-s515-fast.
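A minimal sketch of what such an interception endpoint sees: once a client is pointed at your own server, the model ID sits in plain JSON in every request body. This assumes Cursor speaks the common OpenAI-compatible chat schema; the exact payload it sends is an assumption here, not something the article documents.

```python
import json

def extract_model_id(raw_body):
    """Pull the 'model' field out of an OpenAI-style chat request body.

    Returns None if the body is not valid JSON or lacks the field.
    """
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        return None
    return payload.get("model")

# Hypothetical request body of the kind an OpenAI-compatible client sends.
body = b'{"model": "kimi-k2p5-rl-0317-s515-fast", "messages": []}'
print(extract_model_id(body))
```

In practice @fynnso's trick would wrap something like this in a small HTTP server that logs each incoming body before (optionally) forwarding it upstream.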

The base model of Composer 2 is Kimi K2.5 from Moonshot AI (月之暗面).

After the screenshots spread, Cursor quickly patched the hole, but it was too late; Elon Musk had already reposted the news, confirming it.

Finally, a Cursor executive responded, admitting the use of K2.5 but emphasizing that it was obtained through legitimate licensing from its partner Fireworks AI. Kimi officially confirmed this licensing chain. Legally, Cursor infringed on nothing.

There has been a lot of discussion about this incident, but I want to discuss it from another perspective.

02

In the past two years, there has been an underlying trend in the AI field.

In 2023, the mainstream approach for domestic AI startups was to fine-tune Meta's Llama. The industry consensus at the time was that we were "two generations behind Silicon Valley."

In May 2024, DeepSeek released V2. This company, incubated out of the quantitative fund High-Flyer (幻方), used two technologies, MoE (Mixture of Experts) and MLA (Multi-head Latent Attention), to slash the model's inference cost. I wrote about the logic of MoE in my earlier profile of DeepSeek. Simply put, instead of making the large model one generalist, you make it a group of experts and wake up only the ones needed. MLA sharply reduces memory usage, cutting GPU memory pressure by 67%–90% compared with the traditional architecture.
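The routing idea can be made concrete with a toy top-k MoE forward pass. This is an illustrative sketch only, not DeepSeek's actual implementation: a linear gate scores each expert, only the top-k experts run, and their outputs are mixed by renormalized gate weights. That is why total parameters can grow while active parameters per token stay small.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_weights, top_k=2):
    """Route input x to the top_k experts with the highest gate scores.

    Only the chosen experts execute; the rest stay dormant.
    Returns (mixed output, indices of the chosen experts).
    """
    # Gate: one linear score per expert.
    scores = [sum(w * xi for w, xi in zip(ws, x)) for ws in gate_weights]
    probs = softmax(scores)
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    # Renormalize over the chosen experts and mix their outputs.
    norm = sum(probs[i] for i in chosen)
    output = sum(probs[i] / norm * experts[i](x) for i in chosen)
    return output, chosen

# Toy experts: expert i just scales the input sum by (i + 1).
experts = [lambda x, i=i: (i + 1) * sum(x) for i in range(4)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.0]]
out, chosen = moe_forward([1.0, 0.0], experts, gate_weights)
print(chosen)  # indices of the two highest-scoring experts
```

Real MoE layers use learned neural experts and train the gate jointly, but the cost structure is the same: four experts' worth of parameters, two experts' worth of compute per token.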

At the time, people's impression of DeepSeek was mainly "cheap." When V3 arrived in December with new techniques such as FP8 low-precision training, the officially disclosed full training cost was $5.576 million, roughly one-tenth of Meta Llama 3.1's, while performance was basically on par with GPT-4.

Then in January 2025, R1 was released.

I also covered why R1 matters in that profile. The core point: it reached the reasoning level of OpenAI o1 through pure reinforcement learning (pure RL), with no manually labeled problem sets and no supervised fine-tuning. The model plays against itself and learns to judge what a good answer looks like. This is not "doing what's been done with less money" but "taking a path no one has taken."

After R1, OpenAI's Sam Altman first subtly mocked DeepSeek for "just replicating known work," but later conceded that "the emergence of DeepSeek has changed the situation where OpenAI had been far ahead for the past few years." Meta reportedly set up several dedicated teams to analyze DeepSeek's methods.

This is the first wave.

The second wave comes from Kimi. At the end of January 2026, K2.5 was released: a trillion-parameter MoE model, natively multimodal, strong in code generation, visual understanding, and agent tool calls. Crucially, it is open-source, under a Modified MIT license.

Shortly after release, K2.5's call volume on OpenRouter (an aggregation platform where developers worldwide pick and call AI models) hit first place, ahead of Gemini 3 Flash and Claude Sonnet 4.5. Granted, K2.5 could be called for free within the OpenClaw ecosystem at the time, which significantly boosted the numbers.

Three years ago, domestic companies were fine-tuning Llama. Now top-tier Silicon Valley tools are fine-tuning K2.5. The speed of this reversal has outpaced almost everyone's expectations.

03

Now we come to a more fundamental question: what exactly is the "supply chain" of open-source models?

Most people's understanding of "open-source" stops at free download and personal use, and they take the value of DeepSeek and Kimi to be "bringing prices down for everyone."

That is certainly true as far as it goes, but in the real business world, the circulation paths of open-source models run far beyond it.

Taking the Cursor case as an example, the complete chain is as follows:

Kimi open-sources K2.5 → Silicon Valley inference provider Fireworks AI obtains a license and handles hosting, fine-tuning, and reinforcement-learning training → Fireworks AI sublicenses it to Cursor → Cursor packages it as Composer 2 and serves it to developers worldwide.

Each layer involves technical services, licensing agreements, and a division of commercial interests. This is commerce, not charity.

As a commercial behavior, the supply chain of open-source models is having a global impact, just like the Chinese supply chain in the physical manufacturing field in the past.

For a Uniqlo garment, from yarn to fabric to finished product, the supply chain is also in China. The global market depends deeply on the Chinese supply chain for new-energy-vehicle batteries, photovoltaic modules, and rare-earth processing.

This dependence was built over decades of accumulated cost advantages, engineering capability, and economies of scale. Global brands choose the Chinese supply chain not out of preference but for economic reasons: lower cost at the same quality, faster delivery at the same cost.

A similar phenomenon is emerging in AI. The raw materials are not steel and cotton but model weights and inference compute. Global application-layer AI companies are starting to choose Chinese open-source models as their base, for a simple reason: they are good and cheap.

There is a well-known precedent in technology: Android. Google open-sources AOSP, Qualcomm adapts the chips, Samsung and Huawei customize the devices, and carriers handle distribution. What users get is a Samsung phone, but the operating system's underlying logic, API specifications, and ecosystem standards are defined by Google. Every layer in the supply chain makes money, and the layer that defines the base wields considerable influence.

Of course, this is just a possible direction, not a fait accompli. There is still a long way to go.

04

When it comes to the AI supply chain, we naturally have to mention the first AI field to boom in early 2026: lobster-raising.

OpenClaw is an open-source agent framework, the work of Austrian developer Peter Steinberger. Lobsters need a "brain", or rather, they need to be fed. OpenClaw itself is only a framework and provides no model, so users must choose one themselves.

K2.5 has become OpenClaw's main recommended model, and the big companies have followed: ByteDance's ArkClaw, Tencent's QClaw, Zhipu's AutoClaw, MiniMax's MaxClaw, Alibaba's CoPaw... all launched in quick succession in March 2026. The models with the highest underlying call volume include K2.5, DeepSeek, the Qwen series, and MiniMax. Open-source models continue to dominate token traffic.

This link resembles the physical supply chain. Foxconn manufactures for Apple, Huawei, and Xiaomi; whichever company's phones sell well, Foxconn makes money, because it sits deep enough in the supply chain.

If the Cursor incident exposes the story on the B-end of the supply chain, the lobster ecosystem shows the C-end story. Both point to the same fact: the base model's position looks more and more like infrastructure.

We can also see from the lobster that the narrative of infrastructure is gradually becoming a reality. Tokens are the water, electricity, and coal of the future AI era.

How big is the market for this "water, electricity, and coal"? Here is a set of data for reference.

According to statistics from Huatai-PineBridge Fund, China's overall daily token consumption grew from roughly 100 billion at the start of 2024 to over 30 trillion by mid-2025, and reached 180 trillion in February 2026. Agent applications like the lobsters run around the clock, consuming tokens at levels several orders of magnitude above traditional chatbot conversations.

On March 16th, Alibaba announced the Alibaba Token Hub (ATH) business group, on a par with e-commerce and cloud intelligence and led directly by CEO Wu Yongming. The entire group focuses on one thing: producing tokens, delivering tokens, and applying tokens. Tongyi Lab builds the models, the MaaS business line builds the platform, Qianwen targets the C-end, and the newly formed Wukong Division targets the B-end.

The word "token" was previously used only in the technical community; now a company with a trillion-dollar market value uses it to name its core business group.

If tokens really are becoming the water, electricity, and coal of the AI era, then whoever can stably supply large volumes of tokens at low cost will have a place in the ecosystem. Open-source models have natural advantages here: flexible deployment, controllable cost, no dependence on a single supplier. Models like DeepSeek and Kimi, which cut costs while holding performance, are the low-cost power plants of this market, and they will be major players in it.

05

Why are Chinese open-source models popular?

Cloudflare ran a test: swapping K2.5 in for other models on the Workers AI platform cut inference cost by 77%. Cursor's own disclosures explain the selection logic too: Composer 2 performs slightly below GPT-5.4 but generates faster at the lowest cost. For a company with $2 billion in annual revenue, the math is easy.

Looking at the lobster ecosystem: K2.5 on OpenRouter is priced around $0.5 per million input tokens and $2.8 per million output tokens; Claude Sonnet 4.5 is $3 and $15 respectively, a six- to seven-fold difference. Lobster use cases involve high-frequency calls, and a complex task can take hundreds or even thousands of steps. At that scale, a six-fold cost gap is not about "saving a little" but about whether the workload is affordable to run at all.
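The arithmetic can be made concrete. Below is a back-of-the-envelope comparison using the per-million-token prices quoted above; the 500-step task size and the per-step token counts are illustrative assumptions, not measurements of any real agent.

```python
# Per-million-token prices quoted in the text (USD).
K25 = {"in": 0.5, "out": 2.8}
SONNET = {"in": 3.0, "out": 15.0}

def task_cost(price, steps, in_tok_per_step, out_tok_per_step):
    """USD cost of one agent task: each step sends a prompt and gets a completion."""
    total_in = steps * in_tok_per_step
    total_out = steps * out_tok_per_step
    return (total_in * price["in"] + total_out * price["out"]) / 1_000_000

# Hypothetical agent task: 500 steps, 4k input / 1k output tokens per step.
cheap = task_cost(K25, 500, 4000, 1000)
pricey = task_cost(SONNET, 500, 4000, 1000)
print(f"K2.5: ${cheap:.2f}, Sonnet 4.5: ${pricey:.2f}, ratio {pricey / cheap:.1f}x")
```

Under these assumptions a single task costs a couple of dollars on K2.5 versus more than ten on Sonnet; run thousands of such tasks per day and the gap decides whether the product is viable.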

This extends the price floor DeepSeek laid down. V3 brought the price per million tokens to single-digit RMB, and R1 cut the inference-model price to a fraction of OpenAI o1's. When I wrote the DeepSeek profile, I noted that a price gap like this would shake any market: imagine a phone that cost 26,000 yuan suddenly selling for 1,000.

Being cheap alone may not be enough.

At that price, DeepSeek delivers service on par with the industry's best. The same goes for K2.5: on Cursor's own benchmark, CursorBench, Composer 2 scored higher than Claude Opus 4.6, and its base model is K2.5.

That may seem to imply K2.5 is stronger than Claude, but of course we cannot say so; after all, it may not match most people's chatbot experience.

Cursor's vice president Lee Robinson mentioned in the response that only about a quarter of the final model's compute comes from the base; the remaining three quarters comes from Cursor's own continued pre-training and large-scale reinforcement learning.

Co-founder Aman Sanger explained further: the team evaluated multiple base models, and K2.5 performed best on programming-related metrics. On that base they ran continued pre-training for the programming scenario (adjusting task distribution and capability focus) plus reinforcement-learning training at four times the compute. After all this, Composer 2's performance on various benchmarks is "very different" from the original K2.5.

In other words, Cursor chose K2.5 not because it is "smarter than Claude" but because it had the best potential as a base in the programming direction: after heavy targeted training, it delivers a high cost-performance ratio, approaching top closed-source models at a far lower cost.

This is the real value of the open-source ecosystem: no one needs to train a hundreds-of-billions-parameter model from scratch. Take a strong base, optimize deeply for a vertical scenario, and you can compete with closed-source giants on specific tasks. Cursor is not alone in this; Cognition's Windsurf takes a similar approach.

The cost-reduction space DeepSeek opened has been extended by K2.5 into two key scenarios, agents and code, forming the basic narrative of the Chinese AI supply chain. After K2.5's release, Kimi drew enormous attention: its revenue in 20 days exceeded all of 2025, overseas revenue surpassed domestic revenue for the first time, and its valuation rose from $4.3 billion to $18 billion within three months.

When it comes to valuation, there is a comparison worth considering.

Rumor has it that Cursor's new funding round values it at $50 billion. Its valuation history: $50 million in October 2023, $400 million in August 2024, $2.6 billion in December 2024, $29.3 billion in November 2025. Rocket-like growth.

The narrative supporting that growth matters: "We have our own model R&D capability." Both Composer 1 and Composer 2 reinforce this story.

Yet Kimi, which supplies the base model, is valued at $18 billion, about one-third of Cursor's target. In supply-chain terms, the brand owner's market value is three times its core supplier's, while the core of the brand owner's product comes from that supplier. This ratio is not necessarily unreasonable; Cursor's product strength, user stickiness, and business model have real value. But it at least suggests a cognitive lag in how the market prices the "base" versus the "shell."

This is not an isolated case. Manus, the AI-agent company that was so popular a while back, also has no underlying model of its own and relies entirely on third parties; yet because its product and scenarios were recognized, Meta offered $2 billion.

What's more worthy of attention is the horizontal comparison. Kimi's $18 billion valuation is about 2% of OpenAI's and