
Analyzing Agents from Andrew Ng's Signals: China's AI Opportunities Lie in Execution Rights, Not Models

Yan Jun, 2026-06-08 19:01
The challenge in commercializing agents lies not only in model capability, but also in whether enterprises are willing to delegate part of their business execution authority to AI.

Judgment: When AI moves from answering questions to doing tasks for people, the real commercialization difficulty is no longer just model selection, but whether enterprises dare to hand over part of their business operations to it.

Andrew Ng makes a key point in his book: to improve an AI system, rather than swapping models repeatedly, it is better to work out when the system makes mistakes, why it makes them, and where to start fixing them. This idea has guided the optimization of image recognition and search ranking in the past.

Applied to agents, this logic takes on a new meaning: when AI's task shifts from "answering questions" to "doing tasks for people", the yardstick shifts from model scores to whether enterprises dare to let it do the work, and who is responsible if something goes wrong afterwards.

This is what Chinese AI entrepreneurs most need to understand about the current agent boom.

Beneath the interface lies a new execution authority

For the past few decades, the logic of enterprise software has always been "people operating the system". Whether it is a CRM, a financial system, or a code repository, software only records, standardizes, and accumulates information, while it is always people who make the judgments, click to confirm, and take responsibility.

The profound change agents bring is that software is trying to shift from "recording what you've done" to "doing part of the work for you". This is far more than replacing buttons with dialog boxes. At its core is a transfer of power: what we call execution authority.

It means that the system starts to take over, advance, and even complete business operations that a person previously had to carry out by hand: creating a new customer record, initiating a payment approval, sending an external email, or submitting a code change. Once an operation crosses the line from "suggestion" into "execution", what determines success or failure is no longer the quality of the model, but the company's rules about "who has the right, who approves, and who bears the consequences".

Andrew Ng's recent discussions of agent workflows, context hubs, and programming agents send exactly this signal. The focus of competition is quietly shifting from "who can talk well" to "who can break down tasks, call tools, check results, and iterate until completion, while ensuring that the whole process can be traced and accepted, and that someone can be held accountable if something goes wrong".

Whether a single task is done well is a matter of the model; whether a specific step in a process can be completed is a matter of the business. The former answers "can it be done", while the latter has to answer "once it's done, will the company stand behind it?"

Companies won't easily grant execution authority

Behind even the smallest operation in an enterprise sits a set of rules: who has permission to do it, on what basis, who needs to approve it, whether the operation leaves a trace, and who takes responsibility in the end. This is the underlying logic of how a company operates.

Many agent products get stuck here. They can come up with solutions but can't push the process forward; they can summarize information but can't drive the systems; they can write code but can't understand the company's internal rules; they can answer questions about regulations but don't know whether the regulations have changed; they can give suggestions but won't take responsibility if something goes wrong.

The bottleneck is often not that the model isn't smart enough, but that the Agent simply doesn't know how the enterprise's internal processes operate. It doesn't know which internal interface has changed, isn't clear which data field can't be touched, doesn't know at which step a leader's approval is needed, doesn't understand whether this customer information can be sent out, and doesn't know how to roll back, remedy, and hold someone accountable after an error occurs.

Take programming agents as an example. Andrew Ng pointed out that they will "guess blindly" and make random calls because they can't get the latest and most accurate API documentation. For agents that need to work inside an enterprise, this problem is even more damaging: they face the systems of many departments, complex approval rules, legacy data, and a large body of internal regulations. For such agents, accurate information is not just reference material, but the "construction drawing" they need before starting work. Without this drawing, execution authority is out of the question.

It's the "control layer", not the chat box, that gets the budget approved

This points to a more basic and perhaps less "flashy" opportunity: building an internal control layer for the enterprise. That means connecting and managing the scattered knowledge bases, rules and regulations, operation manuals, API documentation, permission systems, and business systems, and turning them into operating rules that agents must understand before starting work and boundaries they must never cross.

In the past, knowledge bases solved the problem of "enabling people to look up information", while in the Agent era, the control layer needs to solve the problem of "enabling AI to work based on the right information and being able to explain afterwards why it did so, what it used, and whether it crossed the boundary".

The context layer solves the problem of "knowing what to do", and the control layer solves the problem of "how far it is allowed to go". A company won't easily put a chatbot into its core processes, but it may pay for a system that can manage context, permissions, interfaces, and operation records. This capability may not make for a good story, but it is more likely to get customers to approve the budget and go through the payment process.
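As an illustration only, the split between "knowing what to do" and "how far it is allowed to go" can be sketched as a small permission gate. Everything here is hypothetical: the class names, the role names, and the three-way outcome (allow, needs approval, deny) are assumptions made for the example, not a description of any real product.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


@dataclass(frozen=True)
class Rule:
    """One row of a hypothetical policy table."""
    allowed_roles: frozenset             # which agent roles may attempt the action
    approver_role: Optional[str] = None  # None means no human sign-off is required


@dataclass
class ControlLayer:
    rules: dict                          # action name -> Rule
    audit_log: list = field(default_factory=list)

    def check(self, agent_role: str, action: str, basis: str) -> Decision:
        """Decide whether an agent may execute `action`, and record why."""
        rule = self.rules.get(action)
        if rule is None or agent_role not in rule.allowed_roles:
            decision = Decision.DENY            # unknown or unauthorized: refuse by default
        elif rule.approver_role is not None:
            decision = Decision.NEEDS_APPROVAL  # a person must confirm first
        else:
            decision = Decision.ALLOW
        # Log every decision, refusals included, so it can be explained afterwards.
        self.audit_log.append({
            "agent_role": agent_role,
            "action": action,
            "basis": basis,  # which document or rule the agent relied on
            "approver_role": rule.approver_role if rule else None,
            "decision": decision.value,
        })
        return decision


# Hypothetical policy: record creation is free, payments need a manager.
layer = ControlLayer(rules={
    "create_customer_record": Rule(frozenset({"sales_agent"})),
    "initiate_payment_approval": Rule(
        frozenset({"finance_agent"}), approver_role="finance_manager"
    ),
})

print(layer.check("sales_agent", "create_customer_record", "CRM manual"))
print(layer.check("finance_agent", "initiate_payment_approval", "payment policy"))
print(layer.check("sales_agent", "send_external_email", "none"))
```

In this sketch the first call is allowed, the second is routed to a human approver, and the third is refused because no rule covers it. Denying by default is a design choice for the example: an action the enterprise has not explicitly described is treated as outside the agent's boundary.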

The front-end agents may be very cool, but the "connection-control-audit" chain that helps agents run safely and compliantly in the company may generate real revenue earlier.

The essence of a process is a responsibility chain, not a dialogue flow

For an agent to generate value, it must be able to access real-world systems such as ERP, CRM, OA, finance, and supply chain. This path may not sound exciting, but it is more likely to get customers to pay than building yet another chat entry point.

A sales agent has limited value if it can only generate sales scripts; it can only be considered to have completed the sales process if it can read the customer stage in the CRM, automatically call the quotation system, generate to-do tasks, and write them back to the system.

A financial agent has limited value if it can only explain regulations; it can only be considered to have completed the financial process if it can automatically reconcile invoices and contracts, automatically warn of abnormalities, initiate the approval process, and leave audit traces.

A customer service agent is just a replacement for the knowledge base if it can only answer questions; it can only be considered to have completed the customer service process if it can automatically judge problems, check orders, check permissions, and create or transfer work orders.

An agent's value doesn't come from "being like a human", but from being a reliable link in the company. "Reliable" means: knowing what can and can't be done, leaving records of all operations, being traceable if something goes wrong, and being able to roll back.
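One way to picture "leaving records" and "being able to roll back" is to pair every operation with its own undo step. The sketch below is a minimal illustration under stated assumptions: the `ReversibleAction` and `ActionRunner` names are invented for this example, and the in-memory `crm` dictionary stands in for a real business system.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ReversibleAction:
    name: str
    do: Callable[[], None]    # performs the operation
    undo: Callable[[], None]  # reverses exactly that operation


@dataclass
class ActionRunner:
    history: list = field(default_factory=list)

    def run(self, action: ReversibleAction) -> None:
        action.do()
        self.history.append(action)  # record only after the operation succeeded

    def rollback(self) -> list:
        """Undo everything in reverse order and return the names undone."""
        undone = []
        while self.history:
            action = self.history.pop()
            action.undo()
            undone.append(action.name)
        return undone


# Stand-in for a CRM record; the fields are made up for the example.
crm = {"stage": "lead", "todos": []}
runner = ActionRunner()

runner.run(ReversibleAction(
    "advance_stage",
    do=lambda: crm.update(stage="quoted"),
    undo=lambda: crm.update(stage="lead"),
))
runner.run(ReversibleAction(
    "add_todo",
    do=lambda: crm["todos"].append("send quotation"),
    undo=lambda: crm["todos"].pop(),
))

after_run = dict(crm, todos=list(crm["todos"]))  # snapshot before rolling back
undone = runner.rollback()
```

Undoing in reverse order matters because later operations may depend on earlier ones. Real systems often cannot undo an action at all (a sent email, for example), which is one reason such actions tend to sit behind human approval instead.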

So, the agents that can really sell in the early stage are likely to come from very specific business segments: the closed loop of customer follow-up for foreign trade salespeople, the online customer complaint handling of chain stores, the equipment early-warning and order dispatching of factories, the internal compliance inquiries of financial institutions, and the internal code assistants of R&D teams. They don't have the grand story of "general intelligence", but are more likely to receive payment. More importantly, they are easier to sell repeatedly to similar customers.

Focusing on a segment doesn't mean shrinking the market, but doing the math first: whom exactly to serve, which specific tasks to do, and what it costs to do them once. Only when a company can keep acquiring ten customers in the same scenario, with the implementation cost for each customer getting lower and lower, does growth at scale become possible. If each customer has to be customized from scratch, the systems have to be reconnected, and what constitutes acceptance has to be redefined, the business will be too heavy to carry.

The future of programming agents: beyond just writing code

Andrew Ng said that programming agents have very different acceleration effects on different development tasks. In China, this is even more obvious, because much of what is valuable is not on public GitHub but hidden in companies' private code repositories, internal legacy components, outdated documents, old bugs, and old systems that no one dares to touch.

So, the opportunities for Chinese programming agents go far beyond "helping programmers write code". They lie in getting into the actual R&D processes of enterprises: understanding internal code and dependencies, examining problems in light of historical work orders, assisting in writing tests, leaving records of who changed what, and ensuring that sensitive code is never leaked.

Customers may not be willing to pay for an agent that can only write code; an agent that can reduce repetitive labor, reduce the testing workload, help the team sort out old code, and ensure that everything runs within a safe range is more likely to be on the procurement list. The difference between the two still comes down to one thing: getting into the business.

AgentOps is not just operation and maintenance, but control over execution

When an agent really starts to operate on behalf of people, customers will immediately ask: What did it do? Why did it do so? Which tool did it call? Did it act recklessly? Can it roll back if it makes a mistake? Were the operations recorded? Who told it to do so? Who got the results?

On the surface these look like operation and maintenance questions, but in essence they are questions of control. Traditional operation and maintenance ensures that the system doesn't crash, while operation and maintenance for agents ensures that they don't overstep their authority or get out of control. It is not an additional function, but a prerequisite for agents to go online and work.

In fields such as finance, government affairs, and healthcare, agents without strict permission control, operation logs, audit tracking, and rollback mechanisms can only be used for demonstrations. Only those agents that can be monitored throughout the process, managed, and shut down immediately if something goes wrong have the possibility of being really used.
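To make the customer's questions concrete (what it did, why, which tool it called, who told it to, and whether it can be stopped), here is a minimal sketch of a supervised tool call with a trace and a kill switch. The `SupervisedAgent` class, its fields, and the `lookup_order` stand-in are all assumptions made for illustration, not an existing AgentOps API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable


class AgentHalted(RuntimeError):
    """Raised when a halted agent tries to act."""


@dataclass
class SupervisedAgent:
    agent_id: str
    requested_by: str  # who told the agent to do this
    halted: bool = False
    trace: list = field(default_factory=list)

    def halt(self) -> None:
        """Kill switch: every later tool call is refused."""
        self.halted = True

    def call_tool(self, tool: str, reason: str,
                  func: Callable[..., Any], *args: Any) -> Any:
        if self.halted:
            raise AgentHalted(f"{self.agent_id} is halted; refusing to call {tool}")
        result = func(*args)
        # One trace entry per call: what was done, why, with what, for whom.
        self.trace.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "agent_id": self.agent_id,
            "requested_by": self.requested_by,
            "tool": tool,
            "reason": reason,
            "args": args,
            "result": result,
        })
        return result


def lookup_order(order_id: str) -> dict:
    # Stand-in for a real order system; the data is made up.
    return {"order_id": order_id, "status": "shipped"}


agent = SupervisedAgent(agent_id="support-agent-1", requested_by="duty_manager")
order = agent.call_tool("lookup_order", "customer asked about delivery",
                        lookup_order, "A-100")

agent.halt()  # an operator shuts the agent down
try:
    agent.call_tool("lookup_order", "retry", lookup_order, "A-101")
    blocked = False
except AgentHalted:
    blocked = True
```

The point of the sketch is that the trace and the kill switch sit in the call path itself, so an agent cannot act without leaving a record or after being stopped.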

From product to company: the hurdle of commercialization

After the agent craze subsides, investors won't just look at "whether there is an agent" when evaluating projects. They will ask more practical questions: In which specific part of the company is it used? Who decides to pay? How is the effect measured? Can it really save manpower? Is it project-based, annual-fee-based, or charged by usage? Who will be responsible if there are problems during use?

For a startup in the agent business, the most valuable asset may not be the model, but a deep understanding of the business processes of a certain industry, experience in integrating with various systems, a mature permission and audit scheme, or a proprietary method for task acceptance. Its growth flywheel is this: the more customers use it, the more industry processes, tool connections, and knowledge it accumulates. Its growth leverage is this: whether the cost of serving the second, third, and subsequent similar customers keeps falling.

A more realistic path is often to start from a very small but solid business point: the customer profile is clear, the tasks to be done are definite, the results can be measured, and the delivery cost can be controlled. First, prove in a process that customers are willing to pay and the project can make money, and then turn the reusable parts into standard products.

At this point, the role of capital becomes clear: it's not used to verify whether there is a demand, nor to fill the holes in individual customized projects, but to help the proven capabilities be replicated to more customers and more scenarios more quickly.

Money can amplify capabilities, but it can't give you capabilities.

Whether a product can enter the customer's business process determines whether the customer will approve the budget. Whether it can move the customer from "giving it a try" to "being indispensable" determines whether an agent product can grow into a real agent company.

Author

Yan Jun, a contributor to SCMP Opinion, former head of government affairs and government-enterprise cooperation at Kuaidi/Didi, and a technology entrepreneur. He has founded and operated a technology company that has received multiple rounds of financing and has long been concerned about the capital judgment, revenue quality, market access, project delivery, and valuation logic of AI, robotics, and hard-tech projects after they enter the real industrial system.