Roundtable Discussion: Which Sector Will the Next Killer AI Product Emerge From? | 2026 AI Partner · Beijing Yizhuang AI + Industry Conference
What will be the next killer AI product? Will it be smart glasses, an AI agent, or an as-yet-unnamed "physical world gateway"? This roundtable discussion produced no definite answer, but it did reach a consensus: applications that simply wrap large language models are doomed to be short-lived. Only products that are continuously online, connected to the physical world, and capable of completing a real interaction loop have the potential to open up the next multi-billion-dollar market.
Should hardware take the lead, or is the ecosystem the key? Should the focus be on consumer-market scale or on enterprise willingness to pay? Ultimately, it all boils down to the same logic: applications that simply wrap models, and hardware with weak interaction, are bound to be eliminated. Only the trinity of "multi-modal foundation, AI-native agents, and wearable hardware" can take AI out of the chat box and into the real world.
The following is the content of the roundtable dialogue, edited by 36Kr:
Liu Zihao | Co-founder of Hangzhou Yanke Education (Host)
Zhao Weiqi | Head of the Global Open Ecosystem at Leqi
Lu Shaoqing | Head of Technology Management and Head of Multi-modal Products at SenseTime Research Institute
Liu Zihao: Good morning, everyone. I'm Liu Zihao from Hangzhou Yanke. Welcome to this bet on the killer AI product. Today, let's skip the superficial talk and focus on one thing: what will the next mass-market, multi-billion-dollar AI product look like, and in which field will it emerge? We're honored to have two guests today, representing different fields. Please introduce yourselves.
Zhao Weiqi: Hello, everyone. I'm Zhao Weiqi from Leqi. I'm a serial entrepreneur. I've been working on multi-modal AI hardware and software devices, with a focus on the consumer market. Currently, I'm in charge of Leqi's global open ecosystem, exploring cooperation opportunities across chips, hardware, operating systems, APIs, and applications, as well as with universities and non-profit organizations, to move the entire industry forward.
Lu Shaoqing: Hello, everyone. I'm from SenseTime, mainly responsible for the R&D, productization, and industrial implementation of multi-modal interaction technologies.
Liu Zihao: Thank you both. If you had to bet on a field that will give birth to the next killer AI product, which one would you choose? You can use your company's implementation cases to support your judgment. Mr. Zhao, please.
Zhao Weiqi: First, let's establish that the next killer AI product must be a continuously online gateway connected to the physical world. From my perspective, this category must be AI wearables. Different types of wearable products may emerge at different times. Currently, AI-AR glasses are the most suitable. They sit closer to the user, stay online longer, and can connect to the physical world more easily.
People spend much more time each day looking at the real world than at their mobile phones. Most of the time, they are interacting with the physical world. A killer AI product must be high-frequency, meet essential needs, and be used continuously. In this regard, glasses are the best form for continuous online use. Mobile phones need to be actively opened and cannot be continuously online.
Now, let's talk about AI agents. With the emergence of AI agents, everyone is developing agents or other solutions, all hoping that their agents can be continuously online. What kind of hardware or product can keep the agent continuously online and accompany us, helping us handle tasks, connections, and records in the physical world? That is the killer AI product.
In summary, it must be a continuously online gateway connected to the physical world, ensuring continuous, high-frequency use and meeting essential needs. This is also the gateway for next-generation AI. As long as it is a gateway, it must be a killer product.
Lu Shaoqing: Many correct judgments eventually converge on similar directions. What was just said is close to what I think. The killer application can be abstracted further. From the initial models to the current agents, AI has evolved from single-point intervention to long-term, continuous intervention. Currently, agents, whether ChatGPT or others, are still confined to chat boxes in the digital world, and their online time is limited.
I expect that in the next step, we will develop an intelligent agent system that can extend from the digital world to the physical world and truly collaborate with humans in the physical world. Of course, this intelligent agent system can be accessed through glasses or embodied robots, which are all hardware gateways. The biggest challenge for the entire product and technology is to truly achieve continuous and effective interaction with the real world.
For example, imagine I were an AI on this stage right now. After the host and the guests have spoken, most current AIs cannot tell who is speaking, or when and on what topic they should respond. If an agent that can interact with the real world solves this problem, then AI can truly enter the physical world and collaborate with us.
Liu Zihao: As a former debater, I'll briefly ask some follow-up questions based on what the two guests have shared. Mr. Zhao Weiqi, there are two types of AI glasses. One type has no display and is more like an AI earphone with a camera; the other is AI-AR glasses with a display. What do you think of the difference between these two types?
Zhao Weiqi: The hardware form can be diverse, in terms of appearance, function, and target user groups. The presence or absence of a display represents different product forms for different stages and scenarios. Just like mobile phones, there are many types. Glasses without a display are more like an extension of Bluetooth earphones and cameras. They are lighter and suited to lighter application scenarios. Glasses with a display bring AR into the physical world. After recognition, whether of voice or other information, there has to be feedback. Without a display, feedback can only be provided through a third medium, such as a mobile phone, computer, or voice broadcast. The human brain has limited bandwidth for receiving information, and vision is the fastest channel. You can quickly grasp the general meaning of a 300- or 500-word article at a glance, but if it is read aloud, some people may retain it for only a few seconds and forget the beginning by the fifth or sixth sentence. A display increases the human bandwidth for receiving information.
Why is there a difference between glasses with and without a display? Glasses with a display can present the results of AI processing in real time in the real-world field of vision, completing the interaction loop in the real world. Previously, this loop was either not completed or slow, but now it is completed. In many scenarios, whether in the consumer or enterprise market, a display is really needed. For example, navigation requires a display, and most people don't want to look at their mobile phones. The concept of the HUD (Head-Up Display) emerged ten years ago. In early BMW or Mercedes cars, there was a small HUD in front of the driver, allowing you to see information conveniently without interfering with your view of the physical world. That is the consumer market. In the enterprise market, there are many more scenarios, such as reminders, notifications, and inspections. A display is a great way to present information. Those are the demands in different scenarios. Why do people use chatbots and chat boxes? To know what's happening and what the current situation is. If there is only voice broadcast, the brain can't handle that much information, whereas you can visually filter information at a glance. In the era of native AI, a display is inevitable, with corresponding trade-offs.
AI is not just about listening and speaking. If it can only listen and speak, in my opinion, it's just an observer and doesn't participate. It should help you see and process information, and speak up when it has something to say, like a secretary. If the secretary is an introvert and leaves many things unsaid, efficiency drops, and you'd be better off doing it yourself. The ability to display, understand space, and provide real-time feedback in the real world is inevitable. Many manufacturers are developing products in various forms. The fact that the industry is producing so many products confirms to some extent that AI glasses are an important future gateway. There are short-term and long-term trade-offs in form, including changes in business strategy.
Liu Zihao: It seems that a display is a key step for AI glasses to move from being able to listen and capture to truly understanding the real world. The next question: AI glasses today are similar to smartphones around 2010, and everyone is competing for the gateway. What is the most important business model for Leqi? Is it hardware sales, the application ecosystem, or the long-term value brought by intelligent agent services?
Zhao Weiqi: If you've been in the hardware business for a long time, as I have for more than a decade, the first principle to follow is long-termism. Hardware companies that succeed usually take three to five years or more to do so. Otherwise, they may switch to another category. In Huaqiangbei, the business model is short-term and quick-profit. You may produce earphones today, microphones tomorrow, and adult toys the day after tomorrow. It's just a way to quickly realize profits, not a way to advance the industry. Relatively speaking, the original startup teams have their own goals. Our goal was to develop VR, but Leqi has never touched VR in the past ten years. Our goal is to develop terminal devices in the form of glasses that can interact with the physical world, and our business model is based on this.
Hardware is the gateway. When hardware achieves large-scale sales, goes from being usable to being good to use, and people actually start using it, that's the first step: market coverage. When everyone has it, the next step is ecosystem expansion. Once the technology is in place, demand increases significantly. Besides functions like translation, teleprompter, and navigation, one of the most popular agent applications now is "price comparison". It was developed by a partner in our ecosystem. When you see a bottle of Nongfu Spring in the supermarket and ask for the lowest price online, the answer comes back immediately. We couldn't have thought of this before. Without a large user base, it isn't worth it, and developers and entrepreneurs lack the motivation to work on it. So, the second step is ecosystem expansion.
The third step is to create more long-term value. We've been building the ecosystem for three or four years. There were no agents or AI before, but now, with agents, we can see the role they play in long-term service. When you buy the hardware, it's not just a tool. You're not just buying a device; you're buying the ability to help you complete tasks over the long run. The hardware is the carrier, and you expect it to provide long-term capabilities. The core of the capabilities lies in the hardware, and the ecosystem needs applications. In the AI era, agents are the key. Native agents can cover all aspects of your life and are very lightweight. This is our business model. Its core is the operating system and the ecosystem, and the hardware is built on top of them. The operating system is designed to maximize capabilities, allowing more developers to use various technology stacks. It can connect online, offline, edge-side, and cloud-side models. That's the power of the operating system. The ecosystem just needs to be open and inclusive.
The last piece, our strategy for sustained commercialization, is to collaborate with others and jointly deliver capabilities to enterprise and consumer customers.
This is our current thinking and definition.
Liu Zihao: Mr. Lu Shaoqing, SenseTime has been talking about multi-modal large language models. SenseTime is not betting on a single app. Is the bet instead on bringing multi-modality into real-world scenarios, such as AI hardware, robots, and office intelligent agents? In your opinion, will the real killer for SenseTime be the model itself or the specific applications running on the model?
Lu Shaoqing: The model actually determines the upper limit of the intelligence of the entire product or system. It's the foundation.
Applications, apps, or products combined with AI glasses or embodied intelligent robots are the gateways. Everyone has mentioned the operating system layer, which is also the core point we're currently working on. This core point has evolved from the basic pipeline to the current intelligent agents and may grow into a so-called AI operating system in the future.
What does the operating system solve? It maximizes the upper limit of intelligence we mentioned earlier. It manages the context, calls tools more effectively, and understands our real intentions at the right time. For example, even when I say nothing to it, it can understand the situation and proactively tell me if there is something I need to do. This layer of the system is the future core.
This system answers the question of how an agent can evolve from the purely digital world into a real collaborator in the physical world, and whether it can be my real assistant, not just an assistant in a chat box receiving my messages. Current intelligent agents are based on in-depth thinking logic, a process of multi-round calls, continuous thinking, decision-making, and execution. But in this process, I don't want to see all of that in-depth reasoning information. I can analyze the reasoning process for research, but from the user's perspective, I just want to be told at the right time whether I need to intervene, or be told directly, "I can't handle it, please help me".
For SenseTime, we're not just developing models, nor are we just collaborating with downstream hardware manufacturers. We need to deeply integrate the existing model capabilities so that text-interaction logic is upgraded into a system that can actually change the human-machine interaction strategy.
Liu Zihao: Can you give an example of a scenario that SenseTime thought was particularly worth developing? How did you discover this demand?
Lu Shaoqing: I'll give an example of a product that is on-site today: SenseTime's AI explanation brain, which is hosting this conference. When we judge a scenario, I think not only SenseTime but also many entrepreneurs follow the same logic. We need to define a problem. First, is it a high-frequency problem? Only high-frequency problems are worth solving. Second, is the high-frequency problem valuable? Third, can the valuable experience be replicated? We weigh these three factors to determine whether it's worth doing. Only when it's valuable can a product be developed.
Why did we develop the AI explanation brain system? It's also based on these three points. In the past two years, the embodied intelligence industry has developed rapidly. Beyond performance and demonstration scenarios, we can also explore more commercial value. Robots need to enter both enterprise-level business and consumer-level scenarios. We need to solve the real application problems we mentioned earlier. Only after solving these problems can we ensure last-mile application.
Liu Zihao: Next, let's start the rapid-fire question-and-answer session. I'll ask three questions, and please answer each in turn. First, is the killer AI product a hardware gateway, software, or an agent?
Zhao Weiqi: First of all, neither software nor hardware alone can be the answer. It depends on who can complete the interaction loop. In the end, it must be the scenario that completes the loop, and that's the real killer product and scenario. Hardware without an agent is just hardware, and an agent without hardware may just be a chat box. You need to integrate more gateways, including software, hardware, and agents, to complete the loop together. That's the final form.
Lu Shaoqing: I basically agree with what Mr. Zhao just said. I have a sequence in mind. I personally think that hardware is the gateway and should come first. Only when the hardware is in place can users use the software on it. When will it grow into a killer application? It's not just about the popularity of a single product. It depends on whether it can truly integrate into the lives of consumer-level users and achieve high user stickiness. After the gateway is established, if its ongoing functions can retain users, it can eventually grow into a killer application.
Liu Zihao: The second question is, will the next killer application emerge first in the consumer market or the enterprise market?
Lu Shaoqing: I've been working more on the enterprise market. I think there will be more opportunities in the enterprise market. Both the consumer and enterprise markets are possible, depending on the business and product form.
Take the combination with embodied intelligence as an example. From the perspective of the product form, I personally think that when combined with real-world physical interaction, the upgrade of intelligent agents and hardware has a clear boundary effect in the enterprise market. The requirements for consumer-level application implementation are much higher than those in the enterprise market, because the enterprise market has relatively controllable vertical scenarios. So, in this scenario, it's better to polish the product in the enterprise market first. For other product forms, the situation may be reversed. It depends on the specific product form.
Zhao Weiqi: It depends on how you define a hit product. The consumer and enterprise markets have completely different scenarios and goals. The consumer market is more about large-scale scenarios, which means a large number of people use the product. It must be strongly tied to daily use, be high-frequency, and solve practical problems. In the enterprise market, we've also done a lot of projects. The willingness to pay