Interview with Wu Qiang of Houmo Intelligence: A Thrilling Leap from Scientist to Entrepreneur
If you visited Hall H1 at the just-concluded WAIC 2025 (World Artificial Intelligence Conference), you would have found that the two hottest topics in China's computing-power field this year are "super-nodes" and edge-side AI chips.
This alone is evidence that, as large AI models have developed at full speed, computing power is splitting into two poles.
On the one hand, large-model training has become a necessity, and cloud computing power still needs continuous expansion; "super-nodes" support the continued scaling-up of domestic cloud computing power. At WAIC, Huawei and several cloud-side AI chip companies showcased astonishing "super-nodes".
On the other hand, AI keeps landing in one industry after another. The runaway popularity of DeepSeek, in particular, greatly reduced the computing requirements of generative AI and drove the deployment of large-model inference applications, above all on the edge side. At WAIC, several companies displayed compact edge-side AI chips and the intelligent hardware built around them.
It can be said that future generative AI computing will combine cloud and edge computing. Wu Qiang, CEO of Houmo Intelligence, believes that in future generative AI inference, about 90% of data processing will be carried out at the edge and on devices, and only about 10% of complex inference tasks will need to go to the cloud. Only then can AI truly be popularized, enter thousands of households, and be everywhere.
In the past two years, cloud-side opportunities have multiplied NVIDIA's market value sixfold, to more than four trillion US dollars, making it the biggest beneficiary of this wave of AI. The blue-ocean market on the edge side, however, is only now raising its curtain, and it too is incubating an opportunity for the "next NVIDIA". The edge-side market may well be larger than the cloud market and can accommodate more players.
Houmo Intelligence is one of the companies worth watching. Wu Qiang, its founder and CEO, has a deep research background in high-energy-efficiency chips and distributed computing. He worked at Intel, AMD, and Facebook, then returned to China to join Horizon Robotics. More than four years ago he struck out on his own and founded Houmo Intelligence. The company's current strategy is clear: use in-memory computing as a spear to penetrate the last mile of edge-side large-model computing.
But why choose in-memory computing as the core technology for entering the edge-side chip market? What can in-memory computing bring to AI computing?
As many have noticed, the "memory wall" and "power wall" problems of the classic von Neumann architecture have become increasingly serious. As model parameter counts expand into the tens or even hundreds of billions, the energy consumed moving data can exceed that of the computing itself; some industry insiders joke that "the endgame of AI is energy". In-memory computing is a fundamental solution: it performs matrix multiply-accumulate operations directly inside the storage cells, eliminating the data-transfer step.
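The claim that data movement can out-consume arithmetic is easy to sanity-check. The sketch below uses approximate 45 nm energy figures often cited from Mark Horowitz's ISSCC 2014 keynote; the exact numbers vary by process node and design, so treat this as an order-of-magnitude illustration, not a measurement of any particular chip.

```python
# Back-of-envelope: energy of moving data vs. computing on it.
# Figures are rough 45 nm estimates (Horowitz, ISSCC 2014);
# actual values depend on process and circuit design.
PJ_DRAM_READ_32B = 640.0   # read one 32-bit word from off-chip DRAM
PJ_FP32_MULT     = 3.7     # one 32-bit floating-point multiply

ratio = PJ_DRAM_READ_32B / PJ_FP32_MULT
print(f"Fetching an operand from DRAM costs ~{ratio:.0f}x "
      f"the energy of multiplying it")

# For a weight that is used only once per token (the memory-bound
# decode case), nearly all the energy goes to the transfer -- the
# "power wall" that in-memory computing attacks by computing where
# the weights already sit.
transfer_share = PJ_DRAM_READ_32B / (PJ_DRAM_READ_32B + PJ_FP32_MULT)
print(f"Transfer share of total energy: {transfer_share:.1%}")
```

Under these assumptions the transfer dominates by two orders of magnitude, which is why eliminating the round trip, rather than speeding up the multiplier, is the lever that in-memory computing pulls.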
Just before WAIC 2025, Houmo Intelligence released "Houmo Manjie M50", an edge-side chip built on in-memory computing. According to Wu Qiang, the M50's biggest feature is an architectural breakthrough achieved through self-developed in-memory computing technology: for example, a new generation of underlying in-memory computing IP that greatly improves both energy efficiency and area efficiency.
Wu Qiang, founder and CEO of Houmo Intelligence, launching the Houmo Manjie M50 on stage
At the AI-processor level, the M50 uses the self-developed new-generation "Tianxuan Architecture" IPU, which lets floating-point models run directly on the in-memory computing architecture, improving application efficiency. To lower the barrier for customers, the M50 also ships with a new-generation compiler toolchain, "Houmo Dadao", which is easy to use, supports mainstream deep-learning frameworks, and lets customers adapt and migrate to the chip seamlessly.
With these innovations, the M50's performance is outstanding: 160 TOPS of physical computing power at INT8 and 100 TFLOPS at BF16, up to 48GB of memory with an ultra-high bandwidth of 153.6GB/s, and a typical power consumption of only 10W, about the draw of a phone fast charger. These figures mean that intelligent mobile terminals such as tablets/PCs, smart voice devices, and robots can efficiently run local large models with 7B to 70B parameters without relying on the cloud.
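The 48GB/153.6GB/s pairing can be turned into a rough speed bound for local inference. The sketch assumes batch-1 decode is weight-bandwidth-bound (every generated token streams the full weight set through memory) and ignores KV-cache and activation traffic; the quantization widths chosen for the 7B and 70B cases are my own illustrative assumptions, not M50 benchmark results.

```python
# Rough upper bound on batch-1 decode speed:
#   tokens/s <= memory_bandwidth / model_size_in_bytes
# Chip figures are from the M50 spec quoted above; the model
# quantization choices below are illustrative assumptions.
BANDWIDTH_GB_S = 153.6   # M50 memory bandwidth
MEMORY_GB = 48           # M50 maximum on-device memory

def decode_bound(params_billion: float, bytes_per_weight: float) -> float:
    """Bandwidth-limited tokens/s for a model of the given size."""
    model_gb = params_billion * bytes_per_weight
    assert model_gb <= MEMORY_GB, "model must fit in on-device memory"
    return BANDWIDTH_GB_S / model_gb

print(f"7B  @ INT8 (8-bit): <= {decode_bound(7, 1.0):.1f} tok/s")
print(f"70B @ 4-bit       : <= {decode_bound(70, 0.5):.1f} tok/s")
```

On these assumptions a 7B model fits with room to spare and is bounded around 20 tokens/s, while a 70B model only fits at ~4-bit precision, which is consistent with the article's 7B-to-70B local-model claim.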
Houmo Manjie M50
Wu Qiang said that edge-side AI is characterized by fragmentation and extreme operating conditions. For those enabling edge-side large-model scenarios, the M50 series is therefore offered as a complete solution plus AI model, compatible with mainstream host processor architectures such as x86 and ARM, to meet diverse edge-side needs.
With the product finalized, Houmo Intelligence's commercialization is also progressing rapidly. Wu Qiang said they already have several benchmark prospective customers, including Lenovo's AI PC products, iFlytek's smart voice devices, and China Mobile's 5G+AI deployments.
During the nearly two-hour interview, Wu Qiang was very candid. He recounted the founding of Houmo Intelligence, how he made the perilous leap from scientist to entrepreneur, and his views on the technological opportunities of edge-side chips and in-memory computing amid the AI boom.
The following is an edited summary of the conversation between 36Kr and Wu Qiang, founder and CEO of Houmo Intelligence:
01. A Painful Transformation, and an Unexpected Large-Model Windfall
36Kr: Houmo's first-generation products targeted the intelligent-driving market, while the current ones address general edge-side large models. What considerations and strategy lay behind the shift from intelligent driving to consumer terminals such as AI PCs?
Wu Qiang: From the very beginning, we decided to use in-memory computing technology to build more efficient AI chips, and that direction has never changed. But in which scenarios should those chips be applied? That is something we kept exploring and did change along the way. When we started, around the beginning of 2021, our backgrounds led us to develop intelligent-driving chips. At the time we saw that, with Tesla's "software-defined car" shaping user expectations, there was a huge market opportunity for intelligent driving in China.
However, after developing the first-generation products, in the second half of 2023 we felt this path was becoming infeasible. On one hand, the market was fiercely competitive; the pattern of giants and early entrants was solidifying, leaving fewer and fewer opportunities for newcomers. On the other hand, there was a major flaw in how we had defined our first-generation products.
At the time, to demonstrate the technological and energy-efficiency advantages of in-memory computing, we designed our first-generation chips with very high computing power (256 TOPS physical, up to 512 TOPS sparse). But high computing power means high cost, which did not match the 2023 market demand for intelligent-driving chips. In the second half of 2023 the market was consumed by price competition, with intelligent-driving systems dropping to as little as a thousand yuan; people were saying L3 would never come (it would always be L2+++++), and there was no need for high-computing-power chips.
So our computing power was ahead of its time and over-provisioned. Moreover, as a new player, it was hard to get others to adapt to our software stack, which made market entry difficult. We tried reducing the computing power and improving cost-performance in the second generation, but we felt the window for entering the intelligent-driving chip market was narrowing; by the time the second-generation products launched, it might already have closed.
Once we were sure the path was infeasible, we had to change. But change is very painful. Development of the second-generation intelligent-driving chip was already half-done, and abandoning it was a great pity for the R&D team, since all the previous work would go to waste. I was conflicted too, and worried that if we changed course, the industry might see us as uncommitted, as deserters. In the end, though, the pressure of survival outweighed concerns about face, and we resolutely decided to transform.
The next question was: transform into what? From 2023 I had been following large models and, with the team, did extensive research on their technology and market. We came to realize that large models demand both high computing power and high bandwidth, which fits the in-memory computing route, because in-memory computing addresses data and computing together.
Moreover, large-model computing was penetrating from the cloud to the edge. Perhaps general edge-side large-model computing was the opportunity suited to Houmo. With that understanding, in early 2024 we quickly adapted a version of the first-generation chip and launched the M30, with some cuts and optimizations for large models.
Our first public appearance was at MWC Barcelona in early 2024, at China Mobile's exhibit. There we used the M30 to run Zhipu's 6-billion-parameter model, and the results were quite good, which gave us a lot of confidence. Our shareholder China Mobile also encouraged us to explore general edge-side large-model computing further. Weighing all these factors, we firmly committed to transforming toward general edge-side large-model AI chips. After the pivot the team worked very hard and launched the M50 in just over a year.
36Kr: You have several prospective customers, including Lenovo, iFlytek, and China Mobile. Which scenarios will you focus on expanding into next?
Wu Qiang: We are building general edge-side large-model AI chips and currently focus on several application areas. The first is consumer terminals such as tablets and computers, where large models are very useful as productivity tools.
The second is smart voice systems; large-model voice and conference applications are another focus. The third is operators' edge computing: 5G+AI is a trend, China Mobile is an investor of ours, and there is a large market for 5G+AI.
Of course, our energy is limited, so we will concentrate on these areas first. Beyond them, any edge-side scenario that needs large models and is sensitive to power consumption could become a customer, and we will expand gradually. The major directions at present are consumer terminals, intelligent office, intelligent industry, and robotics.
36Kr: You mentioned exploring the market. What are the market characteristics of edge-side chips?
Wu Qiang: They are sensitive to cost and power consumption, and the products need to be small rather than full-size cards. They also have demanding heat-dissipation requirements, because edge-side operating conditions are extreme.
36Kr: Your earlier research field was high-energy-efficiency chips, and you chose in-memory computing for your startup. As it happens, in-memory computing is naturally suited to large-model computing, and you met the large-model opportunity mid-transformation. Much of your past experience seems to have been preparation for this moment.
Wu Qiang: Yes, perhaps it was all predestined. The country and the industry gave us this opportunity, and we seized the new opening that large models brought. Looking back, we transformed relatively early, however painful it felt at the time. What we can do is prepare well and wait for the opportunity to come.
02. In-Memory Computing: From Obscurity to Flourishing
36Kr: In-memory computing is a very cutting-edge technology. Where is there consensus today, and where is there not? What stage is the industry at?
Wu Qiang: The in-memory computing landscape has changed a great deal from four years ago, when I first entered the field.
First, more and more mainstream AI chip companies are talking about in-memory computing. Many listed AI chip companies and unicorns now say they will lay out in-memory computing as the next-generation chip architecture and overturn the von Neumann architecture. Four years ago that was not the case: among mainstream chip companies, only some memory makers such as Samsung Semiconductor were discussing it.
Second, the state has begun to attach importance to in-memory computing, treating it, alongside optoelectronic computing, as a next-generation chip technology direction. In the past year we have taken part in several closed-door discussions organized by departments of the National Development and Reform Commission and the Ministry of Industry and Information Technology.
In addition, many investment institutions now understand in-memory computing in some depth. Four years ago, by contrast, it was a niche concept that only a few institutions understood well.
I think the value of in-memory computing for AI has gradually become a consensus. How to implement it and how to productize it, however, are still at an exploratory stage, and views differ widely.
For example, the focus used to be on low-computing-power in-memory computing, whereas attention has now shifted to high computing power. There is also little consensus on the storage medium: candidates include NOR flash, SRAM, DRAM, and RRAM.
In short, the in-memory computing industry is in a stage of diverse development and land-grabbing; the key is who can launch a genuinely useful product with high energy efficiency and area efficiency. Compared with competitors, Houmo focuses on SRAM- and DRAM-based in-memory computing to achieve high precision and high computing power. We were the first in high-computing-power SRAM-CIM and have gone the furthest, and we have also been laying out DRAM-PIM for more than a year.
36Kr: In-memory computing has its advantages, but as a novel architecture, what challenges does it face on the way to productization?
Wu Qiang: We have been working on bringing in-memory computing technology to market for four years. There is a huge gap between academic research on in-memory computing and its productization and market entry. The hardest bottlenecks are as follows:
First is circuit design. Academic research focuses on technical feasibility, proving something works in principle. A product, however, requires many breakthroughs in circuit design to reach a product-grade implementation that delivers the computing power, precision, and reliability real-world scenarios demand. That takes a great deal of application-level design innovation on top of the academic work.
Second are the many engineering problems of mass production. Mass-producing chips requires solving testability, yield, and similar issues, which in turn requires extending traditional EDA tools and developing design tools specific to in-memory computing. Over the past four years we have built a corresponding technical solution, including MBIST/CBIST, which has been verified in actual tape-outs.