
Has the inflection point of the era arrived? What if AGI arrives in the next two or three years...

GGV Capital · 2026-04-27 11:51
In the short term, AI acts as a substitute; in the medium term, it serves as an accelerator; in the long term, it brings about expansion.

Three months ago, when talking about "AGI in the true sense", "full self-iteration of models", and "fully integrating AI into workflows", the expectations of many leading researchers were still rather conservative. The mainstream view was that these major inflection points were 5 to 10 years away. Today, many of them would revise that answer to 2 to 3 years.

Three months ago, many people were worried that the industry bubble would burst before AI truly developed.

Now, people's concerns have shifted to: Will AI develop too fast? Will I be replaced?

Why is the pace of AI development being re-evaluated at this time? Because something is happening: the fourth scaling law of artificial intelligence, agent scaling, is on the rise, showing a steep upward trend.

In the past, people have witnessed three scaling laws:

The first is the pretraining scaling law represented by GPT.

The second is the post-training and reinforcement learning scaling law (RL scaling law) represented by OpenAI's o series.

The third is the scaling of test-time compute at the inference stage, such as the work done by Google's Deep Think.

Many people believe that the scaling of autoregressive technology based on next-token prediction is approaching its limit and that the industry needs a new paradigm. However, when we see the performance of agents in fields such as programming, we can see that the existing paradigm is far from fully exploited. The fourth scaling law, based on agents, is evolving and developing faster than many people expected.

What's really happening today is not that scaling has failed, but that the object of scaling has changed. In the past, scaling was more about data + parameters + computing power. Now, what deserves more attention is agents + systems + feedback loops.

The industry change that has shifted the development trajectory of artificial intelligence occurred in the past three months.

What happened in the past three months?

Before talking about the past three months, let's cast our eyes further back.

In 2017, the Transformer neural network emerged, essentially providing a unified architecture for the entire artificial intelligence industry that allows for continuous scaling. It made the industry truly understand for the first time that many different tasks can be put into a unified paradigm, and the upper limit of capabilities can be continuously raised through large - scale training.

In 2018, GPT and BERT took two typical routes. The former is more focused on generation, while the latter is more focused on understanding.

Later, GPT's route went further. This is not because it can definitely outperform BERT on a single-point benchmark, but because it is more suitable as a general interface to accommodate diverse human behaviors. When humans interact with the world, they are not always doing classification and extraction. Instead, they often need to issue instructions, set goals, and then generate, modify, plan, and execute. From this perspective, GPT's route had already laid the foundation for the agent stage.

Later, GPT-2 and GPT-3 were released. The most important thing about this stage is that people truly saw, for the first time, that scaling can bring about the emergence of general capabilities.

In 2022, ChatGPT and RLHF brought about a major inflection point. From that moment on, the model was no longer just "able to continue writing", but more clearly "able to do things according to human intentions". This step was particularly crucial as it transformed the model from a statistical "token predictor" into a work interface.

Later, the reasoning models represented by o1 brought the industry's focus back to high - value tasks such as complex reasoning in mathematics and programming. The message it conveys is very clear: not all tokens are of equal value. The industry needs tokens that can continuously output in high - value problems and maintain consistency in complex tasks.

Artificial intelligence has now entered a new stage: Agent Execution. The model is no longer satisfied with just answering users' questions but has started to truly help users with tasks.

If in the second half of last year many leading researchers in Silicon Valley still thought the industry needed the next paradigm, perhaps continual learning or online learning, today more and more people are starting to think we may not need to wait for a brand-new paradigm. It may be enough to scale up agent execution within the existing one.

In the past three months, what the industry has shown is not just "a few more points on the model benchmark", but that AI has truly started working. It is transforming from a tool into an agent.

Previously, AI gave the impression that when a user asked a question, it provided an answer. In essence, it was still a responsive system, a question-answering system, a passive system. But now, AI's performance is becoming more natural, more accurate, and more human-like. It is like a colleague or even a team. In the past, AI was like a database to be called or an exhibit. Today, it is becoming more and more like an entity that can be entrusted with work tasks, a production unit.

In the past three months, many previously scattered clues suddenly started to form a closed loop.

As the industry has reached the agent execution stage on the technical path, at the product level Claude Opus 4.6, GPT-5.3-Codex, and GPT-5.4 have shifted the industry's focus to programming, agents, and long-running tasks during this period. As a result, many teams in Silicon Valley have truly entered a work mode of "one person leading ten or twenty agents". Many companies have started to systematically require employees to write structured skill documents to "raise" their own agents. People are starting to study harness engineering at the engineering level: how to build an environment, how to construct a feedback closed loop, and how to make agents stronger on real tasks.

Finally, this entire set of logic will boil down to many practical issues such as entrepreneurial directions, organizational models, human resource decisions, investment opportunities, multimodal and robot technologies.

In these crucial three months, what we have seen is not a single-point technological breakthrough, not "a certain model suddenly becoming a little smarter", but the entire logical chain of the industry becoming more complete.

If we want to further focus on these three months, perhaps we can lock our eyes on the three products mentioned earlier:

On February 5, 2026, Anthropic released Claude Opus 4.6. On the same day, OpenAI released GPT-5.3-Codex. One month later, on March 5, 2026, OpenAI released GPT-5.4.

What they have in common: looking only at the SWE-Bench leaderboard, their scores may have risen by just 1-2 points, but at the overall model level they demonstrated groundbreaking agent capabilities.

Looking at the release of these products now, we can at least conclude three things:

First, the model iteration rhythm has changed from "yearly" to "monthly".

Second, the focus publicly emphasized by leading laboratories is no longer just the amount of knowledge and chat experience, but programming, agents, toolkits, and professional - level work.

Third, products and actual workflows are maturing in sync. This means that the existing technology is no longer just a concept in the laboratory but has truly started to enter real-world production units.

AI has chosen agents.

Agents have chosen programming.

Currently, the field where the fourth scaling law is most evident is agentic coding.

In the past, AI coding assistants helped users complete code after users wrote it.

Today, in agentic coding, users give it not just a question but a goal. It will break down tasks, select tools, read documents, write code, run tests, check for errors, fix bugs, and finally return the results on its own.
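The write-test-fix loop described above can be sketched in miniature. Everything here is a toy stand-in: the "model" is a hard-coded list of drafts and the verifier checks a single function, but the control flow mirrors the agentic pattern of generating code, running a verifier, and self-correcting until it passes.

```python
# Toy sketch of the agentic coding loop: write code, run tests,
# fix bugs, repeat until the verifier passes. The "model" here is
# a hard-coded sequence of drafts, not a real LLM call.

def run_tests(namespace):
    """Verifier: return a list of failure messages (empty means pass)."""
    failures = []
    try:
        if namespace["add"](2, 3) != 5:
            failures.append("add(2, 3) should be 5")
    except Exception as exc:  # a draft may not even define the function
        failures.append(repr(exc))
    return failures

def propose_draft(attempt):
    """Stand-in for the model: each attempt yields an improved draft."""
    drafts = [
        "def add(a, b):\n    return a - b",  # buggy first attempt
        "def add(a, b):\n    return a + b",  # corrected after feedback
    ]
    return drafts[min(attempt, len(drafts) - 1)]

def agent_loop(max_iters=5):
    """Generate -> verify -> fix until the tests pass or the budget runs out."""
    for attempt in range(max_iters):
        namespace = {}
        exec(propose_draft(attempt), namespace)  # the "write code" step
        if not run_tests(namespace):             # the "run tests" step
            return attempt + 1                   # iterations used
    return None                                  # budget exhausted

iterations_used = agent_loop()
```

The point of the sketch is the shape of the loop, not the stubs: the verifier provides the objective signal, and the agent keeps proposing until that signal says done.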

The difference between the two is not a 20% or 30% performance improvement but a qualitative change in roles. In the past, the user was the main programmer, and AI was the assistant. Now, the user is becoming more like a project manager, and AI is becoming more like an engineering team. To use a not-so-strict but intuitive analogy: today's agentic coding is starting to approach the L4 level in autonomous driving. It doesn't mean there is no human participation at all, but rather that humans no longer need to be involved in every step and no longer need to drive every kilometer themselves. Users don't need to hold the steering wheel all the time; they just need to sit in the driver's seat.

The formula for productivity has changed. In the past, an engineer's output was roughly equal to: time × ability. Now, a coefficient has been added to the formula: time × ability × the number of agents. AI doesn't just give people "higher efficiency" but the ability to scale. What users get is not a "faster self" or a "smarter self" but an "amplified self".
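The informal formula above can be written down directly. This is illustrative only; the numbers are made up, and "ability" is of course not a scalar in reality.

```python
# Illustrative only: the text's informal productivity formula,
# output = time * ability * number_of_agents. All numbers are invented.

def output(hours, ability, num_agents=1):
    return hours * ability * num_agents

solo = output(8, 1.0)           # one engineer, one workday
amplified = output(8, 1.0, 10)  # the same engineer directing 10 agents
```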

OpenAI showed a very representative case on its developer blog in February 2026: GPT-5.3-Codex ran continuously on a blank repo for about 25 hours, consuming about 13 million tokens and generating about 30,000 lines of code. This is obviously beyond the reach of an ordinary human engineer. The case suggests that AI is not only raising the world's average IQ but also changing the scale of time itself: it compounds with time to produce greater scaling effects.

The reason why agentic coding is naturally suitable for creating a new scaling law is that it meets several very critical conditions:

First, agentic coding has very directly verifiable rewards. Whether the code runs, whether the tests pass, whether the bugs are fixed, and whether the results meet the specification: all of this feedback arrives immediately. There is no need for human preference scoring with vague standards, nor for particularly complex subjective alignment.

Second, the data for agentic coding can be naturally synthesized in large quantities. There has not been enough programming data in human history about "AI writing code, making mistakes, fixing them, and iterating on its own in an environment". Once an agent learns to run tasks in an environment, it can generate new trajectory data on its own.

Third, agentic coding naturally supports self - iteration and closed - loop reinforcement. Agents execute tasks in the environment, receive feedback, and correct themselves. It is not a one - time prediction but a continuous generation of new training signals. This means that the system will become stronger and stronger, rather than relying only on external static corpora.

Looking at these three conditions together, agentic coding is essentially constructing a verifiable, synthesizable, and self - reinforcing intelligence engine.
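Taken together, the three conditions describe a data engine: run candidates against a verifier, score them, and keep the high-reward trajectories as new training signal. A schematic sketch follows; the sorting task and every name in it are invented for illustration, not drawn from any lab's actual pipeline.

```python
# Schematic of the "intelligence engine": candidates are scored by a
# verifiable reward, and high-reward trajectories become synthetic
# training data. The toy task is sorting a small list.

def verifier(candidate):
    """Verifiable reward: 1.0 if the candidate sorts a probe correctly."""
    return 1.0 if candidate([3, 1, 2]) == [1, 2, 3] else 0.0

def collect_trajectories(policies):
    """Run every candidate and record its action and reward."""
    return [{"action": name, "reward": verifier(fn)} for name, fn in policies]

candidates = [
    ("reverse", lambda xs: list(reversed(xs))),  # wrong attempt
    ("sort",    lambda xs: sorted(xs)),          # correct attempt
]
trajectories = collect_trajectories(candidates)

# Condition three: only high-reward trajectories are kept as new signal,
# so the system generates its own improving corpus rather than relying
# solely on static external data.
kept = [t for t in trajectories if t["reward"] > 0]
```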

This is not in opposition to the scaling law. Instead, it is a great victory and a restart of the law in a new dimension.

So, what are humans doing at this moment?

Driven by the powerful force of agentic coding, many leading laboratories and the most forward-thinking entrepreneurs have started to adopt a work mode in real-world scenarios where one person leads an "agent team". The work to be done is no longer to write every line of code personally but to assign tasks, review results, adjust directions, and control the rhythm.

The real change lies not only in the improvement of efficiency but also in the change of the smallest production unit. In the past, the smallest production unit was an employee. Now, it is a person + a group of agents. This work mode easily reminds people of a popular concept this year: shrimp farming.

This extremely vivid metaphor accurately describes the actual state of many teams today: everyone is training, managing, and expanding their own "shrimp swarm", a group of agents. It is not just about calling a model but about raising a group of little creatures that can do work for you: give them context, skills, tools, and tasks to run.

Many teams are doing one thing: asking employees to systematically write down their knowledge, experience, and work processes in a SKILL.md file, a kind of "skill package", that tells the agents what skills the work requires, when to trigger them, and how to execute them.
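A hypothetical example of what such a skill document might look like. The format loosely follows the SKILL.md convention of YAML frontmatter plus markdown instructions; the task, field values, and steps here are entirely invented.

```markdown
---
name: weekly-report
description: Compile the team's weekly status report from tracker data.
---

# Weekly report skill

## When to trigger
Use this skill when asked to "prepare the weekly report" or at the end
of each sprint.

## How to execute
1. Pull the tickets closed in the past 7 days from the tracker.
2. Group them by project and summarize each group in 2-3 sentences.
3. Flag any ticket that was reopened more than once as a risk item.
4. Output a markdown report with sections: Done, In Progress, Risks.
```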