
The craziest week for AI. Here are the eight key consensuses you need to know.

未来一氪 · 2026-04-27 19:23
A grand AI event attended by a thousand people reveals the truth about domestic AI.

In just 8 days, a flurry of blockbuster developments rocked the global AI industry. With its dense release pace, high information density, enormous capital volume, and wide range of involved players, the period was nothing short of breathtaking.

From April 16 to 24, nine cutting-edge models were released in rapid succession: Anthropic's Claude Opus 4.7, Alibaba's Qwen3.6-Max, Moonshot AI's Kimi K2.6, OpenAI's ChatGPT Images 2.0, Ant Group's Ling-2.6-flash, Xiaomi's MiMo-V2.5-Pro, Tencent's Hy3, OpenAI's GPT-5.5, and DeepSeek-V4.

During the same period, Amazon and Google announced, one after another, plans to invest $25 billion and $40 billion in Anthropic respectively. Elon Musk's SpaceX announced plans to acquire AI programming unicorn Cursor for $60 billion, while rumors that DeepSeek has begun raising external financing have been swirling.

When these major events of the week are connected, they reveal 5 clear trends:

  • The core battlefield of AI competition has shifted from "chatting" to "getting work done";
  • The top AI echelons of China and the US have basically formed, and both continue to advance aggressively;
  • Chinese AI demonstrates unique competitiveness in open source and cost efficiency;
  • Computing power infrastructure will become a key factor affecting the pace of the AI race;
  • A new multi-faceted industrial relationship of "investment + competition + cooperation" is taking shape.

During this period, the 2026 China Generative AI Conference (Beijing Station) was successfully held from April 21 to 22.

Organized by Zhidx and co-hosted by Zhixingxing, the conference gathered 73 guests from industry, academia, research and investment. Centered on the theme of "Towards AGI, Reshaping the Future", through 1 opening ceremony, 3 special forums, and 6 technical seminars, it provided a panoramic analysis of the industrial context, innovation paradigms, Token economy, and China's opportunities in the AI industry.

Topics covered a wide range, from cutting-edge models and applications such as large language models, multimodal models, world models, agents, and AI glasses, to infrastructure including data, chips, storage, communication, and cloud services.

Guests spoke candidly, discussed pain points, and made predictions, grounded in the present while exploring the future. The breadth and depth of what was shared was genuinely eye-opening.

A clear consensus is that the Chinese AI battlefield has expanded from the model layer to the ecosystem layer.

We have sorted out the key insights shared by guests from the opening ceremony and three special forums, hoping it will be inspiring for you.

1. How can large models grow stronger? Reaching vertical domain expert level is only a matter of time

2. Beware of pitfalls in "lobster" farming and Token purchases! Let's talk about what large model service providers won't tell you

3. 6 counter-consensus insights from the Claude Code source code leak

4. After OpenClaw, where are China's opportunities in the age of agents?

5. Multiple paths for world models: video generation, native multimodal unification, 3D generation

6. Amid explosive Token consumption, how can Chinese AI infrastructure collaborate and evolve?

7. In the second half of the large model era, competition focuses on scenarios, data, and taste

8. From "lobster" agents, AI glasses to Token management, dissecting the wave of Chinese agent deployment

01.

How can large models grow stronger?

Reaching vertical domain expert level is only a matter of time

At the opening ceremony, Zhao Xin, professor at the Gaoling School of Artificial Intelligence, Renmin University of China, centered his speech on a fundamental question: How can large models grow stronger?

First, larger models still hold significant performance advantages. The pre-training paradigm of predicting the next token can build very strong foundational capabilities. For post-training, an important direction is RLVR (Reinforcement Learning with Verifiable Rewards), which improves model capabilities in vertical domains, provides a scaling route beyond supervised next-token-prediction training, and offers a feasible training path for complex agent environments.

Next, large models should be enabled to use tools, such as searching for information or solving problems through code. Reasoning chains augmented with code execution make the problem-solving process more concise and clear, significantly improving reasoning efficiency.

As task complexity increases, models require a large number of interaction rounds, so context window management becomes a challenge. There are two approaches: one is autonomous context compression by the model, the other is using files as external storage for context.

In multimodal deep search scenarios, if content such as images and videos are directly tokenized, the context will expand dramatically. A solution is to write search results to the local file system first, while generating a short summary to enter the context window, and load it on demand later when needed, enabling stable hundreds-round multimodal search.
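The pattern described above can be sketched in a few lines: stash the full result on disk and keep only a short stub in the context window, loading the full content back on demand. This is a minimal illustration, not Zhao Xin's actual implementation; the function names and the fixed summary length are assumptions made here.

```python
import hashlib
import os

RESULTS_DIR = "search_results"

def stash_result(content: str, query: str, summary_chars: int = 200) -> dict:
    """Write a full search result to disk; return only a short stub
    (summary + file path) small enough to live in the context window."""
    os.makedirs(RESULTS_DIR, exist_ok=True)
    # Content-addressed filename so identical results deduplicate.
    name = hashlib.sha256(content.encode("utf-8")).hexdigest()[:16] + ".txt"
    path = os.path.join(RESULTS_DIR, name)
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return {"query": query, "summary": content[:summary_chars], "path": path}

def load_result(stub: dict) -> str:
    """Load the full result back only when the agent actually needs it."""
    with open(stub["path"], encoding="utf-8") as f:
        return f.read()
```

Only the stub's few hundred characters occupy the context; the kilobytes (or, for tokenized video, far more) stay on disk until a later turn requests them.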

▲ Zhao Xin


Give a large model a virtual computer (such as a terminal/sandbox), and it can unleash general capabilities in non-code domains.

Large-scale training requires many diverse environments, and manually creating tens of thousands of them is unrealistic. Recent research has therefore leaned heavily on agent-generated data: DeepSeek-V3.2, for example, greatly increased the number of constructed agent tasks and supporting environments used in reinforcement learning. These simulated environments aim to synthesize large-scale training data cheaply and efficiently.

How can complex environments be simulated? Most agent operations require only a lightweight sandbox. With a Docker-based approach to training code agents, the scalability bottleneck concentrates at the Docker execution layer; one remedy is to have a large model stand in for Docker and provide execution feedback, reducing the number of real containers that must be created.

The core challenge of current multi-agent systems lies in the stability of long-horizon tasks, which requires appropriate workflow design.

In response, the AiScientist system open-sourced by the Gaoling School of Artificial Intelligence separates the decision layer from the expert layer: the orchestrator focuses on stage-level decision making, and experts are responsible for complex subtasks, letting files act as the "bus" for agent coordination, enabling complex orchestration and collaboration.
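The file-as-bus coordination described above can be illustrated with a minimal sketch: the orchestrator publishes a subtask as a file, and an expert reads it and writes its result alongside. The file naming scheme and JSON payloads are assumptions for illustration, not AiScientist's actual protocol.

```python
import json
import os

def dispatch(bus_dir: str, task_id: str, payload: dict) -> str:
    """Orchestrator side: publish a subtask on the file 'bus'."""
    os.makedirs(bus_dir, exist_ok=True)
    path = os.path.join(bus_dir, f"{task_id}.task.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(payload, f)
    return path

def complete(bus_dir: str, task_id: str, result: dict) -> str:
    """Expert side: read the subtask and publish the result next to it."""
    with open(os.path.join(bus_dir, f"{task_id}.task.json"), encoding="utf-8") as f:
        task = json.load(f)
    out = os.path.join(bus_dir, f"{task_id}.result.json")
    with open(out, "w", encoding="utf-8") as f:
        json.dump({"task": task, "result": result}, f)
    return out
```

Because every hand-off is a durable file rather than an in-memory message, the orchestrator can restart, replay, or re-assign subtasks without losing state, which is what makes this layout attractive for long-horizon tasks.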

Finally, Professor Zhao Xin shared three predictions:

1. Model capability expansion is limited by human cognition. There are still limited ways to utilize existing computing power, and breakthroughs require new scaling paradigms.

2. It is only a matter of time before large models reach the level of vertical domain experts, but general AGI remains challenging. It is estimated that 1 to 2 more important training scaling paradigms like "next token prediction" and RLVR need to emerge to promote the realization of general AGI.

3. Most innovations today are engineering innovations; AGI requires more fundamental ones. Model capability and Harness develop in a spiral, adapting to each other: infrastructure compensates for models' shortcomings, and as models improve they drive the infrastructure's evolution. Technology alone rarely forms a lasting moat; talent and data are far more critical.

02.

Beware of pitfalls in "lobster" farming and Token purchases!

Let's talk about what large model service providers won't tell you

Shi Tianhui, co-founder of Qingcheng Jizhi, exposed the current chaos in the Token industry — there are many hidden pitfalls when buying Tokens.

For the same model, if you purchase it from different service providers, you get different performance, different final costs, and service quality can vary dramatically.

In one test, they found that a certain provider's model had obvious problems. When pressed, the provider admitted it was serving an int4-quantized version. Such quantization pushes costs extremely low but severely degrades model performance.

Service providers offering the same model at a lower quote may actually end up costing users more in total, because of differences in cache hit rate.

Cache hit rate is an indicator that has a huge impact on total cost. Due to different technologies, cache hit rates vary greatly between service providers: good ones can exceed 80%, while poor ones have cache that is almost completely useless.

But service providers never tell customers this.
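The effect of cache hit rate on real cost is simple arithmetic. Below is a blended-price calculation with hypothetical numbers (the quotes, the 10x cached-token discount, and the hit rates are assumptions for illustration, not AI Ping's measurements):

```python
def effective_cost_per_mtok(quote: float, cached_discount: float, hit_rate: float) -> float:
    """Blended input-token price per million tokens, given that cached
    tokens are billed at quote * cached_discount."""
    return quote * (hit_rate * cached_discount + (1.0 - hit_rate))

# Provider A quotes higher but caches well; provider B quotes lower
# but its cache almost never hits.
a = effective_cost_per_mtok(quote=4.0, cached_discount=0.1, hit_rate=0.8)  # 1.12
b = effective_cost_per_mtok(quote=3.0, cached_discount=0.1, hit_rate=0.1)  # 2.73
```

With these numbers the "cheaper" provider B ends up more than twice as expensive per input token, which is exactly the trap the talk warns about.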

▲ Shi Tianhui


The AI Ping team tested 600 large model API services from more than 30 Chinese service providers, including model vendors, large internet companies, listed cloud companies, and MaaS vendors.

According to their tests, when large service providers (such as cloud vendors and telecom operators) provide the same model services at similar prices, service performance can differ by 5 times or even more.

They observed that the service quality of various domestic service providers has deteriorated significantly compared with the end of last year. Many service providers cannot guarantee quality for small and medium-sized customers, with slow responses and obvious performance issues.

The boom of "lobster farming" and other industry trends has led to a Token shortage: Tokens are now expensive and slow to serve. Token services are also a black box, and the industry, while growing fast, is chaotic. So how should one choose a Token service provider?

Shi Tianhui introduced AI Ping, developed by Qingcheng Jizhi: it comprehensively evaluates and summarizes the large model service indicators users care about, and provides filtering, sorting, and intelligent routing, allowing users to run 24/7 evaluations of different large model API services and call them on demand.

03.

From the Claude Code source code leak,

we summarize 6 counter-consensus insights

Li Bojie, chief scientist of Pine AI, focused on sharing his takeaways from the leaked Claude Code source code.

In his view, the five-layer permission judgment, error recovery, security protection toolset, anti-distillation defense mechanism and other designs in Claude Code's source code, as well as the approach of building a product using research methods, are all very worthy of learning.

▲ Li Bojie


Li Bojie also shared 6 counter-consensus insights he summarized from the leak:

1. The value of graphical user interfaces (GUI) will gradually decrease, and software value is shifting from interface to data governance. SaaS without data barriers will most likely be eliminated.

Agents read and think much faster than humans, but operate GUI slower than humans, so GUI is not friendly to agents.

Claude Code is a typical product form with low GUI value and high business logic and data governance value: there is no production-grade GUI in its 510,000 lines of source code.

2. Context is the moat for humans to avoid being replaced by AI.

The context AI can access is far less than what a human employee has: design intentions discussed over dinner, pitfalls buried in legacy code, unspoken inner thoughts.

Providing the right context to AI is also the key to using AI well.

3. The essence of an AI-native organization is replacing the traditional top-down hierarchical structure with AI.

Companies like Anthropic and Kimi have significantly reduced middle management and achieved good context sharing: senior management sees frontline signals, and frontline staff directly see strategic context.

The old joke of "cut off the hands of senior management, cut off the seats of middle management, cut off the brains of frontline staff" no longer holds true in AI-native companies.

4. Who will be replaced by AI?

The top (high-value decision-making and creation) and the bottom (work deeply integrated with the physical world) are relatively safe, while the middle (execution-oriented, standardized work without generalization capability) is the most at risk.

AI amplifies technical capability, which raises the relative weight of non-technical skills. The ability to learn new knowledge and adapt to new scenarios is critical.

One-person companies (OPC) don't mean one person can develop an app. For most independent developers, the bottleneck is not writing code, but scarce capabilities such as customer acquisition, building trust, and operations.

5. "Model is Agent" is far from enough.

A real agent carries a bundle of complex Harness to cover the parts the model cannot handle, with far more code than the tools and prompts themselves. Only when a model company also controls the application layer and the fallback engineering can "Model is Agent" come close to holding.

Agent = Model × Harness. Large models provide the "brain", while Harness provides the "hands and feet" and the "reins": how context is provided, how tools are called, how errors are recovered, how security is guaranteed, how cache is shared, how parallelism is coordinated, and so on.

6. The moat of application-layer companies lies outside of technology.

The "legacy technical debt" in Harness reflects the tension between a model company's internal model team and its application team. It gives the application layer a short-term technical lever, but that advantage will be eroded by the model companies' flywheel.

The gap between top-tier models will continue to widen, and mid-tier models will become commoditized. The long-term moats for application-layer companies are data, channels, brand, user trust, network effects, etc.

04.

High-level Dialogue: After OpenClaw,

where are China's opportunities in the age of agents?

The high-level dialogue session was moderated by Zhang Guoren, co-founder and editor-in-chief of Zhidx. The three guests are all prominent figures in the agent field: Huang Chao, assistant professor and PhD supervisor at the University of Hong Kong, leader of the Nanobot team; Wang Ning, head of LobsterAI project and head of intelligent hardware R&D at NetEase Youdao; and Chen Shi, investment partner at Fresco Capital.

Huang Chao's team's open-source project Nanobot implemented the core functions of the original OpenClaw (which has 430,0