
Interview with Wang Xiao of Niankong Technology: The Path of Large Models for Quantitative Hedge Funds

Xiaoxi | 2025-05-23 17:33
The quantitative industry sees the light of AI again: Niankong enters a top-tier international conference for the first time with fundamental research on large models.

Quantitative funds + large models = ?

Half a year ago, most people faced with this arithmetic problem would have answered DeepSeek. With the publication of a new research paper, however, another answer has emerged: Niankong Technology.

On May 15th, Niankong Technology, a quantitative private equity firm, submitted a large-model research paper co-authored with the School of Computer Science at Shanghai Jiao Tong University to NIPS, a top-tier international conference, presenting an "Adaptive Hybrid Training Methodology."

This time, the story is not about a quantitative private equity firm investing heavily in large models and reaping substantial rewards. Instead, Niankong Technology "immersed itself in the game" and produced research results on the underlying theory of large models, becoming the first Chinese quantitative institution to break into NIPS.

Before Niankong, DeepSeek was the only case of a quantitative private equity firm incubating research on the underlying theory of large models and publishing the results. Compared with this "predecessor," Niankong has gone a step further.

Building on DeepSeek's work, Niankong proposed a new and better training method that improves the training efficiency of large models. It is a rare piece of genuinely innovative large-model research from the quantitative industry.

From a technical perspective, DeepSeek emphasized the importance of reinforcement learning. Wang Xiao, chairman of Niankong Technology, and his team found that, compared with DeepSeek's approach of first running a concentrated phase of SFT (Supervised Fine-Tuning) followed by a concentrated phase of RL (Reinforcement Learning), alternating between SFT and RL achieves better training results.

The quantitative industry has long been known as an incubator for AI. Niankong's technical innovation lets AI feed back into the quantitative industry more effectively.

Wang Xiao observed: "In our past experience applying AI to financial data, financial data is characterized by small volume, a low signal-to-noise ratio, and instability. Traditional machine learning and deep-learning algorithms mainly fit the data set, and how well they fit depends entirely on the information content and stability of that data set. This is why applying AI algorithms such as traditional machine learning and deep learning is so much harder on financial data than on Internet data."

In fact, when ChatGPT emerged, Wang Xiao realized early on that large models might help with predicting financial data. "Large models are completely different from traditional machine learning. Large models can bring in information beyond the data of the subtask itself and have cross-modal understanding ability."

Beyond the technical innovation itself, the cooperation between Niankong and the School of Computer Science at Shanghai Jiao Tong University is also significant for industry-academia-research collaboration.

Academia is relatively short on computing power and engineering experience, but it has advantages in talent, theoretical research, and research direction, and can focus on breakthroughs in underlying technologies. Industry has abundant resources and scenarios to drive applications into deployment. The two sides can achieve a win-win.

As the most important innovation of the AI era, large models are bound to go through stages in which research "goes out" of universities and R&D "goes back into" universities. We should seize this wave of technological innovation and break out of the traditional model of doing research in isolation.

A company's genes often determine its fate. Only by connecting closely with academia does a firm have a chance to become a true "cradle" of the large-model industry. A small academic "seed," once sown, can bear fruit across different fields and industries, changing many aspects of ordinary people's lives.

There has not yet been a mature large-model application in the quantitative industry. By taking the "industry-academia-research" path, Niankong Technology may become the "first to eat the crab," that is, the first mover, in large-model quantitative investment.

Wang Xiao told 36Kr that to apply AI well in the financial field, you must understand how the underlying large models work.

Drawing on their large-model training experience, Wang Xiao and his team realized that the core training frameworks are basically the same across vertical fields, so a training framework can easily be transplanted from one vertical field to another.

One move indirectly shows Niankong's larger "ambitions": Niankong Technology has also incubated and established AllMind, whose future main work is research on the underlying algorithms and engineering technologies of large models, with a stronger focus on vertical applications including, but not limited to, financial scenarios.

The energy that quantitative finance can bring to large models may far exceed expectations. With market attention increasingly focused on applications, Niankong may take the lead in pushing large-model applications forward, starting from financial AI.

Only by combining faith in technology with a focus on applications can China's large-model industry unleash its potential and burst with vitality in reshaping the global AI competitive landscape.

Cooperation between enterprises and universities can also advance China's foundational large-model capabilities, in large part because the data and corpora accumulated by Chinese industry are a unique advantage.

For a quantitative institution, sustained investment in underlying theoretical research has no clear return on investment; it requires long-term, technology-driven commitment. Niankong's firm, long-term choice to cultivate this field demonstrates its strategic vision.

Since 2019, Niankong Technology has applied Transformer models in its live-trading product portfolio. This full-chain intelligence has also greatly improved the efficiency of strategy development.

Globally, as competition in large AI models intensifies, it makes sense for the quantitative industry, with its long experience in AI algorithms, to step forward. Meanwhile, international quantitative giants are still at the exploratory, experimental, auxiliary stage of large-model investing. Niankong's choice fits these market shifts and positions it in advance for "overtaking on the curve."

How many possibilities can AI technology bring to quantitative investment? How much potential does Niankong Technology have in exploring large models? With these questions, 36Kr sat down with Wang Xiao, founder of Niankong Technology. The following is a condensed version of the interview:

01

Niankong's foray into large models: A well-prepared battle

36Kr: As a global hedge fund company, how did Niankong start its AI journey? Why did you publish a paper in the field of large models?

Wang Xiao: As early as 2017, we set up a three-person AI team and tried applying machine-learning algorithms to financial data.

The first project was on futures. But the amount of futures data is extremely small, and machine-learning algorithms easily overfit on small samples, so we could not get good results.

In 2018, we applied it to stocks and found the results were very good, because there is far more stock data than futures data. By 2019, we had converted 90% of our live-trading models to neural-network algorithms, namely Transformers. In 2021, our assets under management reached 10 billion yuan, also because we had done the full-pipeline application of machine learning and neural networks well.

2023 was another turning point. When OpenAI's ChatGPT emerged, I realized that large models might help with predicting financial data, because large models can bring in information beyond the data of the subtask itself and have cross-modal understanding ability.

Most of the supervised training in our past AI work amounted to fitting historical data, but the core logic of large models is different. So we thought they might be another kind of model, one that could predict financial market trends.

The emergence of DeepSeek this year not only brought about the democratization of intelligence but also revealed the importance of reinforcement learning. Before that, post-training of large models had relied mainly on supervised fine-tuning.

Since we have long had a solid understanding of large models and have accumulated considerable algorithmic expertise and computing power, we carried out basic theoretical research on large models this year, which produced this paper.

36Kr: The research Niankong is doing sounds very cutting-edge. Can you explain how the "Adaptive Hybrid Training Methodology" differs from DeepSeek's approach?

Wang Xiao: From DeepSeek's training method, we can see that their approach is like doing practice questions intensively for a period of time (SFT) and then intensively taking exams and reviewing the results (RL).

Inspired by how humans learn, we found that switching frequently between doing practice questions and reviewing exam experience may be more beneficial for improving performance (reasoning ability).

So we designed a training method that switches between SFT and RL step by step. Before each training step, an adaptive algorithm we designed decides whether the next step will use SFT or RL.
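To make the idea concrete, below is a minimal Python sketch of such an alternating loop. It is not the paper's actual algorithm: the switching rule (checking whether recent rewards have stalled) and the sft_step, rl_step, and eval_reward callables are illustrative placeholders for the real training machinery.

```python
from collections import deque

def adaptive_hybrid_train(model, sft_step, rl_step, eval_reward, n_steps=1000):
    """Alternate SFT and RL updates, choosing each step adaptively.

    sft_step(model) and rl_step(model) each perform one training update;
    eval_reward(model) returns a scalar quality score. All three stand in
    for the real training machinery.
    """
    recent = deque(maxlen=10)  # sliding window of recent reward scores
    for step in range(n_steps):
        recent.append(eval_reward(model))
        # Placeholder rule: if reward has stopped improving over the
        # window, inject supervised signal ("do practice questions");
        # otherwise keep optimizing against the reward ("take exams").
        stalled = len(recent) == recent.maxlen and recent[-1] <= recent[0]
        if step < recent.maxlen or stalled:
            sft_step(model)
        else:
            rl_step(model)
    return model
```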

Finally, experiments on three different public data sets showed that our new training framework significantly outperformed SFT alone, RL alone, and a naive mixture of SFT and RL, demonstrating that it is a superior post-training method.

36Kr: We noticed that this paper was a collaboration with the School of Computer Science at Shanghai Jiao Tong University. Why did you choose to "co-create" with a university?

Wang Xiao: To use large models for training in vertical fields, you first need to understand every detail of large-model training. That was the starting point of the paper. As for collaborating with a university: academia and industry each have strengths and weaknesses in large-model research, and combining them lets the two sides complement each other and genuinely empower basic AI research in China.

36Kr: The results of this paper are actually quite rare and valuable, because many AI companies have now withdrawn from research on underlying large models.

Wang Xiao: Yes, but what deserves more attention is this: what comes after withdrawing?

For example, Tongyi Qianwen 3 has only one-third the parameter count of DeepSeek, yet its capabilities already exceed DeepSeek's. There will be more and more models with fewer parameters but stronger capabilities, and they are all open-source.

In the future, the greatest competitiveness of most companies will lie in how to make good use of these large models and how to train them better.

A large model is like a general-purpose genius with a very high IQ. But even such a person cannot succeed in investment and quantitative analysis "from a standing start" without the right methods.

All of this rests on a thorough understanding of the underlying principles of large models. And the best way to understand those principles is not to read a thousand papers but to start practicing directly.

36Kr: In collaborating on this paper, did Niankong and the School of Computer Science at Shanghai Jiao Tong University play "complementary" roles?

Wang Xiao: Universities have research capability but lack resources and computing power; for example, many universities don't have enough compute for large-scale reinforcement-learning training, and they also lack data. We have more engineering experience and computing power, but less academic experience in writing papers. Each side can play to its strengths, making the cooperation a win-win.

36Kr: Does Niankong currently have a self-developed large model? Or does it use open-source third-party models?

Wang Xiao: We have a self-developed vertical large model fine-tuned from Tongyi Qianwen 3. For theoretical research, we basically use Qianwen models for training and experiments.
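For context, fine-tuning a vertical model on top of an open-source base typically looks something like the minimal sketch below, assuming the Hugging Face transformers stack; the base-model ID and the finance_corpus.jsonl file are illustrative placeholders, not Niankong's actual setup.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "Qwen/Qwen3-8B"  # assumed open-source base model, for illustration
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Hypothetical JSONL corpus with a "text" field of domain documents.
data = load_dataset("json", data_files="finance_corpus.jsonl")["train"]
data = data.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True, remove_columns=data.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-out", num_train_epochs=1,
                           per_device_train_batch_size=1, learning_rate=1e-5),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```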

02

Starting from financial AI, but not limited to financial AI

36Kr: What is the current scale of Niankong's AI team?

Wang Xiao: The team has a few dozen AI engineers. We cultivated 70% to 80% of them ourselves out of universities; only a small number were hired externally.

36Kr: In recruiting large-model engineers, cultivating them from scratch out of universities seems to be common industry practice. It's like building application skills for the quantitative industry starting in school.

Wang Xiao: Yes, because our internal platform was developed entirely by our own IT team. So whether it's generating a factor feature or training a model, everything happens within a highly standardized, integrated framework.

So if a person is capable, after interning at our company for more than six months they can become a proficient employee, because by then they are fully familiar with our research tools.

36Kr: Niankong also established an independent AI company, AllMind. How is the work divided between AllMind and the company's internal AI team? Why set up AllMind separately?

Wang Xiao: There is a clear division of labor between AllMind's AI team and Niankong's AI team.

Niankong's AI team mainly uses machine-learning and deep-learning algorithms to fit financial data. Its scenarios are quite vertical, and it is mainly responsible for technical research on specific problems and for model optimization.

AllMind's work focuses more on large models: optimizing large-model training algorithms, researching engineering techniques, academic exploration of high-quality CoT data production, plus research in the general large-model field and vertical applications in financial scenarios. The hope is to make breakthroughs in basic AI research that radiate into more fields, including finance, providing the business with more possibilities and room for imagination.

Niankong is a quantitative private equity fund, a for-profit enterprise, whereas AllMind focuses more on basic research and applications of large models and does not aim for profit in the short term. Since the two companies' work is completely different, AllMind was set up separately.

36Kr: For research on other vertical applications beyond financial scenarios, how does Niankong plan to start?

Wang Xiao: Given AllMind's accumulated experience and understanding of the algorithms and engineering behind large models, together with our post-training experience doing SFT and RL on financial data, we realized that the core training frameworks are basically the same across vertical fields. So transplanting a training framework from one vertical field to another is easy.

For example, every field needs high-quality prompt and CoT data; every field needs SFT first, to give the model basic domain knowledge, followed by reinforcement learning; and every field needs a correct and efficient reward model.
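As one illustration of that last ingredient, a reward model for a verifiable task can be as simple as a programmatic checker. The sketch below is hypothetical, scoring numeric-answer completions, and is not AllMind's actual reward model.

```python
import re

def reward(completion: str, gold_answer: str) -> float:
    """Score a completion: 1.0 for a correct final numeric answer, plus a
    small bonus if the completion shows multi-step reasoning."""
    match = re.search(r"(-?\d+(?:\.\d+)?)\s*$", completion.strip())
    correct = match is not None and match.group(1) == gold_answer
    multi_step = completion.count("\n") >= 2  # crude check for CoT steps
    return float(correct) + (0.1 if multi_step else 0.0)

# e.g. reward("Step 1: 2 + 2 = 4\nStep 2: 4 * 3 = 12\n12", "12") -> 1.1
```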

In the short term, AllMind will focus on training a specialized large model on financial data and on solving some pain points of today's large models, such as improving their logical reasoning ability, reducing hallucinations, and exploring whether