
Interns do the work, supervisors give guidance, and AI can be "smart on demand".

ifanr · 2026-04-13 08:55
For AI companies, this means a new dimension of competition. It's no longer about competing on a single axis such as parameters or price, but on the cost-effectiveness of intelligence.

We really want to swim across the ocean and tell Anthropic to stop being such an overachiever. It keeps shipping new features, one after another:

An Advisor strategy has been added to Claude's API toolbox. When Sonnet or Haiku runs into difficulty mid-task, it consults Opus for direction and then continues running. The whole exchange completes within a single API request.

To be honest, this approach is nothing new in the developer community. Many (cash-strapped) developers already do it in their own workflows: use inexpensive models for simple tasks and call expensive models only when deep reasoning is required, switching models by hand. Now Anthropic has productized this "poor man's" workflow into an official feature that can be enabled with a single line of code.
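For context, the manual version of this workflow takes only a few lines. Everything below is an illustrative sketch: the function names, the `ESCALATE` sentinel, and the stubbed model calls are our own assumptions, not any vendor's API.

```python
# A minimal sketch of manual model routing: try the cheap model first,
# and escalate to the expensive model only when the cheap one opts out.
# `call_cheap_model` / `call_expensive_model` stand in for real API
# calls (e.g. to Haiku and Opus).

ESCALATE = "ESCALATE"

def call_cheap_model(prompt: str) -> str:
    # Stand-in: a real implementation would call the cheap model's API
    # with an instruction like "answer, or reply ESCALATE if unsure".
    if "architecture" in prompt:  # pretend hard questions stump it
        return ESCALATE
    return f"cheap answer to: {prompt}"

def call_expensive_model(prompt: str) -> str:
    # Stand-in for the strong (expensive) model.
    return f"expensive answer to: {prompt}"

def route(prompt: str) -> str:
    """Cheap first; pay for the strong model only on demand."""
    answer = call_cheap_model(prompt)
    if answer == ESCALATE:
        return call_expensive_model(prompt)
    return answer
```

The drawback the article points out applies here too: the developer decides the escalation rule by hand, and has to maintain it.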

"Interns" do the work, "Directors" give guidance

In a typical AI agent architecture, the common practice is to put the most powerful model in charge: it breaks tasks into smaller pieces and hands them to cheaper models for execution. The strong model manages from above, the weak models execute below, a top-down structure.

Anthropic's advisor does the opposite. The weak model takes the lead, and the strong model acts as a consultant.

Specifically, Sonnet (or the even cheaper Haiku) acts as the "executor" and runs the task end to end: calling tools, reading results, iterating. When it hits a decision point it is unsure about, say a fork in the code architecture where it must choose between plan A and plan B, it doesn't guess wildly. Instead, it actively initiates a "raise-hand" tool call, sending the current context and a specific question to Opus.

As the "consultant", Opus doesn't join the execution after reviewing. It writes no code and modifies no logic; it only returns a brief suggestion (usually 400–700 tokens): "Use plan A because of XYZ. Watch out for part Z." After receiving the suggestion, Sonnet carries on. The entire exchange is invisible to the user.
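The "raise-hand" loop described above can be sketched as follows. This is a simulation, not Anthropic's actual API: the `advisor` and `executor` functions, the fixed step list, and the `DECIDE:` marker are all invented for illustration.

```python
# Sketch of the raise-hand pattern: the weak model runs the loop, and one
# of its "tools" forwards a question (plus context) to the strong model,
# which replies with a short directive rather than doing the work itself.

def advisor(question: str, context: str) -> str:
    # Stand-in for the strong model: a brief suggestion, no code.
    return f"Use plan A for '{question}'; watch the edge cases."

def executor(task: str) -> list[str]:
    log = []
    # A toy task plan; a real agent would generate steps dynamically.
    for step in ["read repo", "DECIDE: plan A or plan B?", "write code"]:
        if step.startswith("DECIDE:"):
            # Unsure -> raise a hand instead of guessing.
            advice = advisor(step.removeprefix("DECIDE: "), context=task)
            log.append(f"advisor said: {advice}")
        else:
            log.append(f"executor did: {step}")
    return log
```

The key property is visible even in the toy version: the advisor contributes one short message at one decision point, while the executor produces everything else.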

To put it more plainly: it's like an intern who raises a hand to ask the director when stuck. The director points out a direction, and the intern keeps working. The director is paid a director's salary (Opus price) but only says a few words, so that part is cheap; the intern works the whole time (Sonnet/Haiku price) at a low unit rate, so the total cost stays very low.
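The arithmetic behind the analogy is easy to check. The per-token prices below are invented round numbers for illustration only, not real rate-card figures:

```python
# Illustrative cost math (all prices invented): an expensive advisor that
# only speaks a few hundred tokens barely moves the total bill.
CHEAP_PER_MTOK = 1.0    # executor output price, $/million tokens (assumed)
PRICEY_PER_MTOK = 15.0  # advisor output price, $/million tokens (assumed)

def cost(exec_tokens: int, advisor_tokens: int) -> float:
    """Total dollar cost of a run split between the two models."""
    return (exec_tokens * CHEAP_PER_MTOK
            + advisor_tokens * PRICEY_PER_MTOK) / 1_000_000

all_pricey = cost(0, 50_000)             # strong model does everything
intern_plus_advisor = cost(50_000, 600)  # cheap model runs, ~600 advisor tokens
```

Under these assumed prices, the all-Opus run costs $0.75 while the intern-plus-advisor run costs $0.059, an order of magnitude less, even though the advisor is 15x more expensive per token.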

Anthropic's own test data shows:

- Sonnet + Opus advisor scores 2.7 percentage points higher than Sonnet alone on SWE-bench Multilingual, while cutting cost by 11.9%.
- Haiku + Opus advisor scores 41.2% on BrowseComp, more than double Haiku alone (19.7%), at only 15% of Sonnet's cost.
- The CEO of Bolt commented: "Architectural decisions on complex tasks are significantly better, with no extra overhead on simple tasks."
- A machine-learning engineer at Eve Legal said: "In structured document extraction, Haiku dynamically upgrades its intelligence through the advisor, matching frontier-model quality at one-fifth the cost."

Even interns can't be stupid

This model has an easily overlooked premise: the intern must be smart enough to judge accurately when a task is beyond it.

This is essentially the backbone of the advisor pattern and the prerequisite for the whole feature to work. A genuinely poor model doesn't even know what it doesn't know; fearless out of ignorance, it may confidently pick the wrong solution. In that case the advisor call is never triggered, which is more dangerous than using a poor model throughout. And since the dialogue happens only between models and is never shown to the user, you may believe Opus has everything under control when in fact it was never called.

This is why Anthropic's advisor tool currently supports only Sonnet and Haiku as executors, not arbitrary models. Both have been trained within the Claude family to know when to ask for help and when they can handle things on their own.

The benefits of this strategy are obvious: it lowers the barrier to entry, letting users adopt best practices without engineering knowledge. But it has a subtle side effect: it saves you money, just not on your own terms.

When developers build their own model routing, they can freely mix models from any vendor: DeepSeek for screening, GPT-5 for reasoning, Gemini for summarization, always choosing the cheapest option. It's an open, fully self-designed, cross-platform money-saving strategy.

However, the advisor tool only supports models within the Claude family. The executor must be the in-house Sonnet or Haiku, and the consultant must be Opus. You can't use GPT as a consultant or Gemini as an executor.

In theory, any model can call another model through a tool call; Anthropic's restriction is purely a product-strategy choice.

Since the idea isn't original, could you engineer a similar cross-vendor setup yourself? The advisor pattern requires interaction between models. Suppose you want DeepSeek to take the lead, send its context to Claude when it gets stuck, and continue executing after Claude responds. Several problems arise:

- Output format: the structured way Claude returns suggestions (e.g., wrapping plan steps in XML tags) may not be parsed and followed accurately by DeepSeek. In-house models have their formats aligned with each other; cross-vendor pairs don't.

- "Language" differences: each model has its own preferred way of thinking and expressing itself. The suggestions Opus gives Sonnet use wording and logical structures Sonnet can easily understand and execute. Suggestions Claude gives DeepSeek will be understood, but execution accuracy suffers, much like a native English speaker instructing a competent non-native speaker: fine most of the time, with deviations in the subtle places.

- Tool-use format incompatibility: vendors' function-calling formats differ slightly. Claude's and DeepSeek's tool use are not identical in JSON schema or parameter passing. This is what you run into when building an agent pipeline across vendors, and the format conversion in the middle is full of pitfalls.
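As a concrete illustration of that last point, here is the kind of shim a cross-vendor pipeline needs. The two dict shapes are simplified versions of Anthropic's `tool_use` content block and the OpenAI-style `tool_calls` entry; real responses carry more fields, and DeepSeek's variant differs again.

```python
import json

# Convert a (simplified) Anthropic tool_use block into an OpenAI-style
# tool_calls entry. The subtle trap: Anthropic gives arguments as a
# parsed dict (`input`), while OpenAI-style APIs expect a JSON *string*
# (`arguments`). Forgetting to serialize is a classic cross-vendor bug.

def anthropic_to_openai_style(block: dict) -> dict:
    return {
        "id": block["id"],
        "type": "function",
        "function": {
            "name": block["name"],
            "arguments": json.dumps(block["input"]),
        },
    }
```

Every extra hop like this is a place where an agent pipeline can silently drop or mangle a field, which is exactly why the in-house pairing is smoother.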

The official native solution is naturally the smoothest. But once you add the advisor to your workflow, you're locked into the Claude ecosystem. It's a sophisticated piece of business design: Anthropic doesn't stop you from saving money; it actively helps you save, and uses that very act of saving to strengthen platform stickiness.

And is it really cost-effective? After all, DeepSeek bills in RMB while Claude burns US dollars.

The community did it first, Anthropic just added a button

The idea behind Advisor is not new. It's more that Anthropic has packaged an existing community practice into an official product.

The most common money-saving technique in the developer community is "model routing": use cheap models for simple tasks (classification, summarization, formatting) and call expensive models only when deep reasoning is required. Many open-source projects implement this pattern.

Many features we take for granted today were once hand-assembled by the developer community. The "Projects" workspace, now nearly standard across AI platforms, began as developers stitching together system prompts and document context at the API level, all centered on one project. The prototype of Artifacts was likewise developers using Claude/GPT/Gemini to generate HTML/React components and pasting them into a previewer by hand. Today these functions ship built into the platforms' own clients.

Recently, Anthropic launched Dispatch (remote control of the desktop from your phone) and Channels (IM integration), essentially Claude's own OpenClaw. Then it turned around and banned OpenClaw.

From a product-strategy perspective, the newly released Advisor reflects a larger trend: with computing power tight, cost-effectiveness is not only what users chase but also a pain point for the companies themselves. It points to a shift in the AI pricing model.

Regular AI pricing is simple: pick a model and pay per token. If you want the highest quality, use Opus; if you want it cheap, use Haiku. Once you've chosen, that's the price. Essentially you're buying computing-power time, the capacity to process a certain number of tokens.

The Advisor model is more flexible. You no longer choose a fixed level of intelligence; the system allocates intelligence dynamically according to task complexity, and vendors in turn can schedule computing-power resources more flexibly.

For end users, this is good news: you don't pay full Opus price for a task that needs only a little intelligence. But for AI companies, it also means a new dimension of competition. It's no longer just about competing on parameters and price; it's about competing on the cost-effectiveness of intelligence.

This article is from the WeChat official account "APPSO". Author: Selina, Editor: Li Chaofan. Republished by 36Kr with permission.