
Farewell to the "seven-step marathon" of scientific research. An AI agent is rewriting the way knowledge is produced.

36Kr Brand · 2026-03-24 09:31
What would happen if the entire scientific research process were handed over to an intelligent agent for execution?

Research has long been romanticized.

In practice, it is not a single unified act but a highly segmented assembly line. Retrieval, screening, reading, organization, hypothesis, experiment, and verification, and then back to writing and publication: these seven steps constitute the basic path of almost all academic work.

The problem is that each step in this path consumes time, but not all of them create value.

Literature retrieval often means screening out a few dozen truly relevant papers from thousands. During the reading phase, one needs to understand the methods and conclusions of each paper and build a still-unstable cognitive structure in the mind. Only when defining the problem does the researcher start the "creative" part, and by then a great deal of time has often already been spent.

These steps, in essence, belong to "deterministic labor." They can be disassembled, described, and repeatedly executed, but still rely on manual work. This creates a typical mismatch: The most precious cognitive resources are largely consumed in the most easily replaceable parts.

In the past decade, AI has indeed entered the field of research, but mostly remained on the periphery. It helps people find papers faster, translate texts more smoothly, and even write a well-structured review. However, these capabilities have not changed the basic form of research. Research is still a "seven-step marathon," just run a little faster.

A more radical proposition has emerged: What would happen if the entire research process were handed over to an intelligent agent?

The recently upgraded AI academic agent, Qiewen Academic (the Chinese version of WisPaper), presents a new possibility. It entrusts deterministic labor to computing power and returns uncertain inspiration to humans. Behind this statement lies a wholesale redistribution of how research is produced.

AI doesn't produce papers but accelerates the process

In the traditional research process, the seven-step marathon requires people to repeatedly switch between steps and connect them.

The emergence of Qiewen Academic doesn't mean it can directly produce papers. Rather, it is embedded in the research process as a comprehensive capability. Given a research task, the system can start from literature retrieval, then complete reading, analysis, and information structuring. On this basis, it can identify potential problems, move on to experimental design and execution, and finally output results and reports.

The role of AI in research has thus changed. In the past, AI was more like an "assistant," providing advice or helping with certain parts of the work. These tasks were mostly local, such as translating a paper, summarizing a passage, or completing a piece of code. Researchers needed to continuously take over the process, switch between different tasks, and maintain the overall progress.

Because machine hallucinations are inevitable, this kind of output also had to be reviewed again to avoid the risk of academic fraud and falsification. As an AI agent, Qiewen Academic is more like an "executor." It can autonomously complete some content without continuous human intervention. This means that, for the first time, it is possible to "entrust" the research process.

To make a more intuitive analogy, it is a bit like autonomous driving. In the autonomous driving system, humans are responsible for setting the goal, and the system is responsible for the path and execution.

The same logic is being introduced into research, and a similar division of labor is emerging. Researchers define the problem, and Qiewen Academic, as an intelligent agent, is responsible for the process.

The changes in the research process start to become apparent here.

First, ownership of the process is redivided. Tasks that originally had to be completed step by step by humans are integrated into a process that the system can take over as a whole. Steps such as retrieval, reading, and organization, which used to rely heavily on manual work, no longer require intervention one at a time but are processed continuously under the same logic.

Secondly, the research work mode shifts from a serial process to a parallel structure. After the intervention of such AI intelligent agents, research no longer has to proceed along a single path. Multiple hypotheses can be developed simultaneously, and multiple directions can be verified in parallel. A researcher's work mode changes from solving one problem to managing a set of problems.
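The shift from serial to parallel verification can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `verify` is a hypothetical stand-in for one verification run, and its toy rule has nothing to do with how Qiewen Academic actually works.

```python
from concurrent.futures import ThreadPoolExecutor


def verify(hypothesis: str) -> tuple[str, bool]:
    """Hypothetical stand-in for one verification run (a real run would be an experiment)."""
    # Toy rule so the example is deterministic: "supported" if the text length is even.
    return hypothesis, len(hypothesis) % 2 == 0


def verify_all(hypotheses: list[str], max_workers: int = 4) -> dict[str, bool]:
    """Check several hypotheses concurrently instead of one after another."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() runs the calls concurrently and yields results in input order.
        return dict(pool.map(verify, hypotheses))


if __name__ == "__main__":
    print(verify_all(["H1: A", "H2: AB", "H3: ABC"]))
```

The researcher's role in this picture is to choose the list of hypotheses and read the resulting table, not to run each check by hand.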

As the process itself begins to be reorganized, the rhythm of research also changes.

A 100-fold speed engine, the first "generational gap" in research

In terms of product capabilities, the first thing Qiewen Academic does is "decouple" the chain of the traditional research path. In the traditional path, every handoff between steps carries waiting and switching costs. Qiewen Academic removes much of that overhead, with claimed efficiency gains of one to two orders of magnitude.

This change is tangible. According to its public information, compared with traditional manual work, the AI4S model of Qiewen Academic is estimated to improve literature retrieval efficiency by 10 to 100 times, compressing literature screening that originally took weeks into minutes. Paper reading efficiency is improved by 20 times, with reading and organization that originally took months compressed into hours of structured extraction. Problem identification, which systematically scans and locates issues across all of the data, is 50 times faster. Efficiency gains of this size could reshape much of the research lifecycle.
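How stage-level multipliers combine into an end-to-end gain is easy to misjudge, so here is a small worked calculation. Only the speedup factors (10x as the low end for retrieval, 20x for reading, 50x for problem identification) come from the figures above; the manual hours per stage are hypothetical numbers chosen for illustration.

```python
# Hypothetical manual durations in hours; these are NOT from the article.
MANUAL_HOURS = {"retrieval": 80.0, "reading": 320.0, "problem_identification": 100.0}

# Speedup factors quoted in the article (retrieval uses the low end of 10-100x).
SPEEDUP = {"retrieval": 10.0, "reading": 20.0, "problem_identification": 50.0}


def accelerated_hours(manual: dict[str, float], speedup: dict[str, float]) -> dict[str, float]:
    """Time per stage after applying each stage's speedup factor."""
    return {stage: manual[stage] / speedup[stage] for stage in manual}


def overall_speedup(manual: dict[str, float], speedup: dict[str, float]) -> float:
    """End-to-end speedup: total time before divided by total time after."""
    after = accelerated_hours(manual, speedup)
    return sum(manual.values()) / sum(after.values())


if __name__ == "__main__":
    print(accelerated_hours(MANUAL_HOURS, SPEEDUP))
    print(round(overall_speedup(MANUAL_HOURS, SPEEDUP), 1))  # 500 h / 26 h, about 19.2
```

Under these assumed durations the combined gain is about 19x, not 50x or 100x: the overall figure is a time-weighted combination dominated by whichever stage remains slowest after acceleration.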

Meanwhile, Qiewen Academic's efficiency gains rest on effectiveness and reliability. According to the published data, the accuracy of its literature search reaches 93.78%, while mainstream models are generally around 70%; its accuracy in document layout analysis, formula analysis, and table analysis is above 90% in each case, which is higher than the industry average.

These capabilities do not directly produce conclusions but determine the form in which information enters subsequent processing. Variable relationships, experimental structures, and data distributions are disassembled in advance, and reading changes from processing papers one by one to structured reception.

In testing, the review consistency of Qiewen Academic reaches 22.26%, and its citation authenticity is close to 99.8%. The former determines whether information from different sources can be incorporated into the same logical framework, and the latter is a big step toward eliminating the machine hallucinations of generative models.

It is on this basis that the value of its embedding in the research process is established.

A highlight of this upgrade is its in-depth involvement in stages such as experiments. Once a paper is uploaded, the system automatically reads and understands it and breaks down the core tasks and algorithm logic. On this basis, it analyzes the experimental methods and generates an executable experimental plan. It then automatically builds the computing environment, including computing power configuration and dependencies, generates code, and executes the experimental process. Finally, it outputs the results and a complete experimental report.

The entire process does not require manual step-by-step intervention. Qiewen Academic can automatically generate an experimental path based on existing literature or research gaps identified by the system, autonomously match or find data, complete environment setup, execute experiments, and output results.

In the traditional research process, "cognition" and "execution" are separated. Understanding can be accelerated, but verification still relies on humans. Now, the entire process that originally required humans to switch between tasks and repeat trial and error is accelerated as a whole. The research process changes from being "human-driven" to "intelligence-driven."

In this sense, it may represent a generational change in research efficiency.

None of this can be achieved by general-purpose large models. Take learning ability: traditional large models may be good at exams, but they struggle to learn new knowledge they have never seen before. In the CL-bench test, large models need to understand a completely unfamiliar set of rules and apply them immediately in context. Most models fail at this step, with an average success rate of only 17.2%.

The research scenario precisely relies on this ability. Every problem is new. Only when the model can quickly establish an understanding of rules in the context does it have the basis to enter the research process.

Therefore, Qiewen Academic has made targeted optimizations for advancing tasks in a real-world environment. Its training method, AgentGym-RL, creates an environment closer to real-world research. The model needs to continuously adjust its path in tasks such as web operations and experimental processes. Execution cannot rely on preset answers but has to be continuously corrected through feedback.

According to relevant papers, a small model in the 7-to-8-billion-parameter class (Llama-3.1-8B), after AgentGym-RL training, achieved performance comparable to or even better than GPT-4o and Claude 3.5 Sonnet in multiple scenarios.

At the same time, during training it assigns higher weights to tokens related to key capabilities such as reasoning and code, so that capability improvements are aligned with the training metrics.
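Token weighting of this kind can be sketched as a weighted negative log-likelihood. This is a generic illustration only: the article does not say how the weights are chosen or applied, so the function name, the weighting scheme, and the sample numbers are all assumptions.

```python
import math


def weighted_nll(token_probs: list[float], weights: list[float]) -> float:
    """Weighted mean negative log-likelihood over a token sequence.

    token_probs[i] is the model's probability for the correct i-th token.
    weights[i] is that token's importance weight (assumed scheme: higher
    for tokens tied to reasoning or code, 1.0 for everything else).
    """
    if len(token_probs) != len(weights):
        raise ValueError("token_probs and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    loss = sum(-w * math.log(p) for p, w in zip(token_probs, weights))
    return loss / total_weight


if __name__ == "__main__":
    probs = [0.9, 0.5]                      # second token is predicted poorly
    print(weighted_nll(probs, [1.0, 1.0]))  # plain average
    print(weighted_nll(probs, [1.0, 3.0]))  # poorly predicted token counts 3x
```

With uniform weights this reduces to the ordinary average loss; raising the weight on a poorly predicted token increases its share of the loss and therefore of the gradient signal.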

However, the ability to advance tasks in a real-world environment is still not enough on its own. For the model to truly enter the research process, it also needs to solve a more hidden problem: the stability of the training itself.

RLHF is the core alignment path for almost all large models. However, this method has a well-known difficulty: PPO training is extremely unstable. This is why many models perform well in short tasks but start to show uncontrollable deviations once they enter complex processes.

Qiewen Academic uses PPO-max, which adds finer-grained constraints and reward mechanisms to keep the training process stable rather than leaving it to luck.
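For readers unfamiliar with where the instability comes from, the following sketch shows the standard clipped surrogate objective of ordinary PPO. It is not PPO-max, whose specific constraints the article does not describe; it only illustrates the basic clipping mechanism that such variants build on. The function name and sample values are illustrative.

```python
def ppo_clip_objective(ratios: list[float], advantages: list[float], epsilon: float = 0.2) -> float:
    """Standard PPO clipped surrogate objective, averaged over samples.

    ratios[i] is new_policy_prob / old_policy_prob for the sampled action.
    advantages[i] is the estimated advantage of that action.
    """
    if len(ratios) != len(advantages) or not ratios:
        raise ValueError("ratios and advantages must be non-empty and equal in length")
    terms = []
    for ratio, advantage in zip(ratios, advantages):
        # Keep the ratio inside [1 - epsilon, 1 + epsilon].
        clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
        # Taking the minimum removes any incentive to move the policy
        # far from the old one within a single update.
        terms.append(min(ratio * advantage, clipped_ratio * advantage))
    return sum(terms) / len(terms)


if __name__ == "__main__":
    print(ppo_clip_objective([1.5], [1.0]))   # gain is capped at the 1 + epsilon boundary
    print(ppo_clip_objective([0.5], [-1.0]))  # pessimistic bound for a negative advantage
```

Clipping bounds how much a single update can change the policy, but it does not by itself address reward scale or drift from the reference model, which is the kind of gap additional constraints aim to close.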

After achieving stability, there is execution. Invoking tools, writing code, and handling environmental dependencies are full of uncertainties. Traditional models often rely on templates in this part or only stay at the level of "generating code." Once they enter the real execution environment, deviations will occur.

In the research environment, information is not always consistent. Conclusions may conflict between papers, and data sources may introduce bias. If the model simply merges information, its output is easily distorted when sources disagree.

When Qiewen Academic encounters inconsistencies between "existing memory" and "current input," it forms two internal processing paths and finally makes a choice based on different signal intensities. This enables the model to have basic judgment ability in a complex literature environment instead of passively accepting information.

When these capabilities are aggregated, the change is no longer a local improvement. It represents a real paradigm shift in the research production method.

When research returns to "humans," the critical point of accelerated breakthrough

In this change, what is changed is not just efficiency.

The research work mode starts to shift from personally completing every step to making judgments at key nodes. When the execution is taken over by the system, researchers no longer need to repeatedly enter those deterministic processes. Instead, they gradually withdraw from specific operations and stand at a higher level to understand problems, choose paths, and examine results.

This change seems subtle but is quietly rewriting the division of labor in research. The smartest minds no longer need to run in the process. They shift from a role closer to an executor to an architect or a leader.

At the same time, another invisible threshold is disappearing. In many fields, code, computing power, and experimental environments stand between ideas and results. Once agents like Qiewen Academic handle these requirements, the entry threshold for research will be redefined.

As a result, the research competition starts to move forward. It changes from who can achieve results to who can see the problems earlier, returning to the "humans" who define the problems. Some researchers who were originally limited by technical conditions can also more directly participate in the problems themselves.

The essence of research is knowledge production. When the cycle of knowledge production is compressed, it affects the rhythm of the entire technology system. In addition to the decrease in time cost, the update frequency of the knowledge base is also accelerating. In fields such as new materials, targeted drugs, and clean energy, which are limited by verification costs, once the verification is compressed, the path screening will be significantly accelerated, wrong directions will be eliminated earlier, and feasible paths will emerge faster.

This means that research will continuously approach the answer through high-density exploration. The trial-and-error process that originally took years to accumulate will occur repeatedly in a shorter cycle.

The way technological breakthroughs happen also changes, shifting from accidental discoveries that rely on individual experience to gradual convergence through high-frequency verification.

As this rhythm continues to accumulate, a state closer to the critical point begins to emerge. Research enters a new division of labor structure: AI is responsible for advancing known paths and continuously compressing the deterministic parts; while humans stay in the unknown area to judge which problems are worth further exploration.

This article is from the WeChat official account "Intelligent Emergence", published by 36