NVIDIA launches a general deep research system that can connect to any LLM and supports personal customization.
NVIDIA is also developing a deep research agent.
The latest paper introduces NVIDIA's Universal Deep Research (UDR) system, which supports personal customization and can be connected to any large language model (LLM).
This means it can operate around any language model, allowing users to fully customize their deep research strategies and have them implemented by the agent.
To demonstrate its versatility, NVIDIA has also released a research demonstration prototype of UDR with a user interface, which can be downloaded from GitHub.
Commenters online see it as a breakthrough in agent autonomy and a good fit for enterprise work.
Built-in models and strategies
The paper states that all previously launched deep research agents use hard-coded methods and can only execute specific research strategies through fixed tool selections.
In contrast, NVIDIA's UDR system can operate around any LLM.
It also enables users to create, edit, and optimize their fully customized deep research strategies without additional training or fine-tuning.
The figure above shows the components of a typical deep research tool (DRT). Unlike ordinary conversational LLMs, DRTs tend to keep users updated on their progress before generating reports.
A DRT consists of two parts:
A simple user interface: receives research prompts, keeps users updated on research progress, and displays research reports;
Agent logic: either a code agent (which coordinates the combined use of large language models and tools through code) or an LLM agent (which directly leverages the model's own reasoning and tool-calling capabilities).
Whether from Gemini, Perplexity, or OpenAI, existing DRTs mainly adopt rigid research strategies, leaving users little room for customization beyond the research prompt. DRTs built on LLM agents often offer only a single underlying model, or models from the same family that share the same post-training behavior.
Although this has not stopped DRTs from becoming widely popular, it limits their practicality in three respects:
1. Users cannot set their own priorities among sources, automatically verify the authority of information, or control search costs.
2. Existing DRTs cannot produce the specialized document analysis that high-value industries require.
3. The models used in existing DRTs cannot be swapped out: users cannot freely combine the latest or most powerful models with deep research agents to create a more powerful DRT.
NVIDIA's UDR system proposes a general solution to the above problems.
Simply put, unlike specialized DRTs, UDR receives both a research strategy and a research prompt from the user, allowing a higher degree of customization.
UDR compiles the strategy from natural language into executable research orchestration code, executes it, and delivers the final report to the user.
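The compile-then-execute flow described here can be illustrated with a minimal sketch. This is not NVIDIA's implementation: the function names (`compile_strategy`, `execute_strategy`), the `run(prompt, tools)` convention, and the canned `fake_llm` and `fake_search` stubs are all assumptions made so the example runs without a real model or search backend.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for any text-completion model; not a real API.
LLM = Callable[[str], str]


def compile_strategy(strategy: str, llm: LLM) -> str:
    """Ask the model to turn a natural-language strategy into Python source."""
    request = (
        "Convert the research strategy below into a Python function "
        "`run(prompt, tools)` that returns a report string.\n\n" + strategy
    )
    return llm(request)


def execute_strategy(source: str, prompt: str, tools: Dict[str, Callable]) -> str:
    """Run the generated code in its own namespace and return the report."""
    namespace: dict = {}
    exec(source, namespace)  # a real system would sandbox this step
    return namespace["run"](prompt, tools)


def fake_llm(_: str) -> str:
    """Canned 'generated' code, so the sketch needs no real model."""
    return (
        "def run(prompt, tools):\n"
        "    # Step 1: search for the prompt\n"
        "    hits = tools['search'](prompt)\n"
        "    # Step 2: assemble the report\n"
        "    return 'Report on ' + prompt + ': ' + '; '.join(hits)\n"
    )


def fake_search(query: str) -> List[str]:
    """Hypothetical search tool returning fixed results."""
    return ["result A for " + query, "result B for " + query]


strategy_text = "1. Search for the prompt. 2. Summarize the results in a report."
code = compile_strategy(strategy_text, fake_llm)
report = execute_strategy(code, "UDR", {"search": fake_search})
print(report)
```

Because the strategy ends up as ordinary source code, it can be read and audited before it runs, which is the property the article attributes to UDR.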
Its most significant innovative features include:
Research strategies customized through natural language. UDR lets users define and program their own research workflows in natural language, and the system converts them into executable, auditable code.
This means that the workflows users design can be put to practical use directly, without retraining the AI model or performing complex debugging.
A model-independent research tool architecture. UDR decouples research logic from the language model, enabling developers to wrap any large language model, regardless of vendor or architecture, into a fully functional deep research tool.
This leaves more room for product design: the most advanced AI model can be paired with a customized research strategy, and the two can be combined flexibly.
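One way to picture this decoupling is a research tool that depends only on a small model interface. The sketch below is an assumption for illustration, not UDR's actual code: `LanguageModel`, `ResearchTool`, and the two toy models are hypothetical names.

```python
from typing import Protocol


class LanguageModel(Protocol):
    """Hypothetical minimal interface the research logic depends on."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Toy model standing in for one vendor's LLM."""

    def complete(self, prompt: str) -> str:
        return "echo: " + prompt


class UpperModel:
    """Toy model standing in for a different vendor's LLM."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


class ResearchTool:
    """Research logic that never references a concrete model class."""

    def __init__(self, model: LanguageModel) -> None:
        self.model = model

    def summarize(self, text: str) -> str:
        return self.model.complete("Summarize: " + text)


# The same research logic wrapped around two different models.
tool_a = ResearchTool(EchoModel())
tool_b = ResearchTool(UpperModel())
print(tool_a.summarize("x"))
print(tool_b.summarize("x"))
```

Swapping models then means passing a different object to the constructor, while the research logic stays unchanged.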
A user-controllable, strategy-driven research interface. The prototype in the following figure demonstrates four practical functions: modifying research strategies in real time, selecting from a library of preset strategies, receiving progress notifications, and viewing analysis reports.
UDR improves computational efficiency by separating control logic from language model inference. The generated code is fully responsible for orchestrating the entire deep research process and runs directly on the CPU, avoiding language model inference that would cost dozens of times more.
The system calls the LLM only when the user-defined research strategy explicitly requires it, and each call processes only the compact, targeted text fragments stored in code variables.
This two-part design, which delegates orchestration to CPU-executed logic while strictly limiting the LLM to precise and efficient calls, not only reduces GPU resource consumption but also significantly reduces the overall latency and cost of deep research tasks.
Needs further exploration
However, this work currently has certain limitations.
First, how accurately UDR executes a research strategy depends entirely on the quality of the code generated by the underlying AI model. Although the researchers reduced errors by requiring the generated code to include comments, the system may still occasionally misinterpret a strategy or introduce logical errors when the strategy is vague or underspecified.
Second, UDR assumes that the research strategies users design are reasonable and executable. The system performs only basic checks and does not judge whether the strategy steps are actually effective. If a strategy is poorly designed, the final report may be of low quality, incomplete, or not generated at all.
In addition, although UDR displays research progress in real time, the current version does not support user intervention during execution (only task cancellation is allowed) and cannot adjust the research direction based on real-time feedback.
All decisions must be fixed before the research starts, which leaves long-running or exploratory research tasks without much flexibility.
In response to these problems, the researchers also propose several directions for improvement:
For example, shipping a library of research strategies that users can modify and customize, further exploring how to let users control the language model's free reasoning process, and automatically converting a large number of user prompts into agents with deterministic control.
Currently, NVIDIA's UDR system is only at the prototype stage and has not been officially launched. We look forward to a fully functional official version.
Reference links:
[1] https://x.com/rohanpaul_ai/status/1964689864244203596
[2] https://research.nvidia.com/labs/lpr/udr/
[3] https://github.com/NVlabs/UniversalDeepResearch
This article is from the WeChat official account "QbitAI", author: Buyuan. 36Kr is authorized to publish it.