
Feng Dagang Interviews Misa, Founder of Rokid: Why Might Mobile Phones Become Accessories for Smart Glasses in Five Years?

晓曦, 2026-01-20 15:44
How far away is the "iPhone moment" for AI glasses?

Over the past year, AI hardware has returned to center stage in the tech industry.

From mobile phones and computers to cars, from headphones and speakers to all kinds of wearables: now that large models can power interaction on almost any electronic device, every hardware maker is trying to become the form factor that carries AI capabilities.

Is it a more powerful mobile phone? A smarter car? Or a brand-new computing entry point worn close to the body?

Amid these explorations, AI glasses are moving to the forefront with growing certainty. Unlike XR glasses, which have been tried repeatedly over the past decade but have always struggled to reach the mass market, AI glasses are widely expected to carry the next-generation human-machine interaction paradigm: always online, available at any time, and able to handle information lookup, photo and video capture, route navigation, instant translation, AI Q&A, and environmental understanding without the user taking out a phone.

The mobile Internet changed the relationship between "people and information", and large models are rewriting the relationship between "people and the world". Glasses sit right at the entrance to this relationship: they are at the center of human vision and attention.

For many years, Rokid has been thinking about and investing in this direction.

Rokid has lived through the hottest years of the AR concept and the long hibernation after capital and the market cooled; it has witnessed the rise and fall of the industry's "next entrance" narrative and borne the cost of long-term investment in an uncertain track.

Today the smart glasses track is clearly splitting. Some products still follow the design logic of audio glasses, adding limited model capabilities on top of basic voice interaction; this is the "light AI" route. Rokid has chosen a seemingly more arduous route, striving to build full-stack AI glasses with real multimodal interaction capabilities. From the camera and microphone on the input side, to the display and voice feedback on the output side, to the AI application system that supports large-model access and ecosystem expansion, Rokid's product strategy aims to deliver a more comprehensive AI terminal experience rather than a simple stacking of individual functions.

At this special session of the Alibaba Cloud Tongyi Smart Hardware Exhibition, Feng Dagang, the CEO of 36Kr, who has been observing tech startups for more than a decade, had an in-depth conversation with Misa, the founder and CEO of Rokid, who has worked on smart glasses for ten years, on the topic "How far away is the iPhone moment of AI glasses?"

They talked about:

How to make AI glasses a high-frequency device: Entertainment products solve the "giant screen" problem, but they are usually "nice to have". Rokid places more value on frequent, long-duration wear and use, aiming to turn glasses into an everyday device like a watch. This is why the company repeatedly emphasizes "daily usage time", "direct message delivery", "walking and cycling navigation", and "casual shooting": these are the ways to turn glasses from toys into habits.

How glasses cameras differ from mobile phone photography: The advantage of glasses cameras is not that they replace mobile phone imaging, but that they capture the moments mobile phones miss, with more natural, less intrusive, and lower-cost recording.

How to make large models the infrastructure of products: Large models are the foundation, but whether they can be turned into products depends on the on-device experience and infrastructure: stability, latency, connectivity, interaction, battery life, and engineering reliability. Rokid has chosen to integrate Tongyi Qianwen while remaining open and switchable.

How to ensure that AI glasses can develop their own ecosystem: The Agent Store and third-party capability access (such as vehicle control) that they mentioned point to a more platform-oriented attempt: first make the core high-frequency capabilities solid, then let the ecosystem grow on the glasses.

Will glasses be the new carrier in the AI era: Misa's core judgment is straightforward: the AI era definitely needs a new carrier, and mobile phones will not remain the only main device in the next era. There is no consensus yet on whether glasses will be the final form, but precisely because there is no consensus, startups have a window in which to place their bets.

These discussions also answer a question that often puzzles outsiders: why have smart glasses been hyped for so many years, yet only now seem to be really taking off?

The answer may lie not in the glasses themselves but in the fact that large models have finally supplied the missing piece of the puzzle. In the past, the "intelligence" was too weak, and glasses could only tell a story based on display and hardware form. Now that visual understanding, voice interaction, instant Q&A, and environmental recognition are available, glasses have a reason to exist independently of mobile phones.

So the industry has entered a new period of divergence. On one side, large companies are entering the market and driving sales through channels, subsidies, and ecosystems. On the other, long-term players like Rokid are trying to prove that winning in entry-point hardware depends not only on advertising but also on the product's ability to move users from trying it out to making it a habit.

This also brings the question back to the origin of this conversation:

What exactly does the iPhone moment of AI glasses mean?

Is it the inflection point when sales jump from hundreds of thousands to tens of millions? Is it the moment engineering overcomes the "impossible triangle" of battery life, display, and comfort? Or is it the emergence of a "killer application" that is high-frequency, essential, and capable of changing daily behavior?

At an event like the Smart Hardware Exhibition, full of slogans and visions, the conversation between Feng Dagang and Misa brings together long-term observation and judgment on one side and ten years of investment and practice on the other. They are trying to find a direction closer to the answer in the noisiest of places.

The following is the transcript of the conversation, edited by 36Kr:

Feng Dagang: Today's event is a bit like the smart glasses market, noisy but full of vitality. Only by expressing oneself loudly can one be noticed.

Misa: Yes.

Feng Dagang: Misa is the founder of Rokid. In February 2025, Rokid made a splash when you stood on the stage and said, "Today, my speech manuscript is on my glasses."

Misa: Yes, it's inside the glasses.

Feng Dagang: As an investor and media person who has been following hardware for many years, I was very happy that day. I felt that I finally found the answer to the question of what smart hardware should do. Did Rokid become popular after that day? How did you feel after becoming popular?

Misa: We expected smart glasses to take off in 2025, but two things surprised us: first, how intense the explosion was; second, that we were the ones who triggered it. I had thought it would take a joint effort across the market, and I did not expect that one speech to suddenly lift the entire smart glasses track in 2025. It was quite unexpected.

Feng Dagang: What routes do you think AI glasses are taking?

Misa: The first type is entertainment-oriented, for watching movies and playing games, with a focus on the giant-screen experience. The second type is light AI, which evolved from audio glasses with a little AI capability added. Meta Ray-Ban is the representative, and the glasses from Xiaomi and Li Auto also belong to this category. The third type is full-function AI glasses, which need a display, microphone, speaker, and complete AI capabilities, forming a complete input-output system. There aren't many products in this category, and Rokid's is currently the most complete one in mass production. So there are basically three types: entertainment, light AI, and full-function AI.

Feng Dagang: So which category does Apple's head-worn device belong to?

Misa: It's more like a head-mounted display.

Feng Dagang: How many AI glasses were sold worldwide in 2025?

Misa: Including new products like Meta Ray-Ban, the total was about 5 million units, of which Ray-Ban accounted for more than 3 million. Among full-function glasses, Rokid accounted for the vast majority, with a market share of over 70%. Rokid is expected to still account for more than half in 2026.

Feng Dagang: How many units does a 50% market share approximately represent?

Misa: We shipped more than 200,000 units last year, and that was in only three and a half months. For this year, a rational forecast is 800,000 to 1 million units.

Feng Dagang: What user needs does each of the three types of AI glasses meet?

Misa: Entertainment-oriented glasses are mainly for giant-screen movie watching, which is a "nice to have" with relatively low demand. Our entertainment-oriented glasses, Rokid AR Lite, sell only 1/10 as many units as the full-function model, with a daily usage time of about 1 hour. In contrast, Rokid Glass is worn for about 8 hours a day on average, and I basically wear mine all day.

Feng Dagang: Why not 16 hours?

Misa: 8 hours is the average. Some people only wear them when needed.

Feng Dagang: Is there an ultimate capability the industry agrees we will all need in the future?

Misa: There is demand for multimedia content, but it faces an "impossible triangle": it is difficult to balance display power, battery life, and wearing comfort. If a breakthrough comes in the next three to five years, a combination of the three types of glasses will be the future: you can use the giant screen when you want it, AI assistance when you need it, and wear them as ordinary glasses the rest of the time.

Feng Dagang: Why do we need three-in-one glasses? Are they the future? What exactly can they do that mobile phones can't?

Misa: Our current user data shows several clear high-frequency scenarios. First is photo and video shooting. Many people think glasses are meant to replace mobile phones for photography, but they are actually a complement. Mobile phones emphasize "taking good photos", while glasses emphasize "capturing moments", recording anytime and anywhere in a more natural way. For example, when you photograph children or conduct interviews, the subject may feel self-conscious on seeing a mobile phone, but glasses are more discreet. Second is notifications. While I've been chatting with you, dozens of WeChat notifications have flashed on my glasses. Most of them don't need to be dealt with immediately, so I can stay focused without repeatedly taking out my phone. Third is AI Q&A. For short, quick questions, such as asking about the weather or looking something up, you can just ask the glasses directly, which is faster than opening a search engine. Fourth is navigation. When walking or cycling, following the instructions in the glasses is much more convenient than holding a mobile phone. What really differentiates glasses from mobile phones is that they provide a more natural and instant "active AI service".

Feng Dagang: In the long run, will glasses be a mobile phone accessory or an independent device?

Misa: They may still be accessories for the next three years, but things will change in three to five years, and it's very likely that they won't be accessories after five years. Mobile phones may become terminals for computing, communication, and storage, with interaction happening mainly on the glasses. Glasses are also a better carrier for "active AI".

Feng Dagang: How long is the battery life?

Misa: It can be used continuously for about 6 to 8 hours, and can last a day with intermittent use.

Feng Dagang: The next question will be a bit more pointed. You said that Rokid is the number one in China, but the global annual sales of smart glasses are only in the hundreds of thousands, which is very small compared to the hundreds of millions of mobile phones sold annually. What's your view on this?

Misa: First of all, I rarely say publicly that Rokid is number one. Although the data shows it, I don't want to trigger discussions like "the number-one company in the industry only sells two or three hundred thousand units". Instead, I'd like to share some figures on momentum. First, during this year's Double 11, Rokid's sales increased by 800% year-on-year, reaching nine times the previous level. Second, going from more than 200,000 units last year to the target of 1 million units this year is a 400% increase. This growth inflection point is well worth looking forward to. Third, a single SKU that sells 1 million units can already match the sales of many mobile phone models in consumer electronics. So although the current total volume is small, like a "primary school student", the growth is extremely fast, and it will soon enter the "college entrance examination" stage. I believe that by 2027 and 2028, the industry will reach the level of tens of millions of units. It's only a matter of time before the entire market (including audio glasses) exceeds 10 million units this year.

Feng Dagang: I'd like to follow up. We've been saying "AI glasses will be the next entrance" for many years. Why do we still have to say "wait for another two or three years"? Is it because the products are too expensive, the functions are not good enough, or we haven't found the "killer application"?

Misa: I think the core problem is market education. The real trigger is the maturity of large AI models, which has only happened in the past one or two years. Before large models like Qianwen and DeepSeek became popular, so-called AI was relatively "stupid". It has been only one year from the explosion of large models to the popularity of our products. Look, among the people here today, maybe half haven't even heard of Rokid. This shows that market awareness still needs time. The industry was only ignited last year; this year and next, we'll deepen market education together, and the scale of tens of millions of units will soon be reached. Personally, I'm very confident.

Feng Dagang: Speaking of large models, I know that you're using Alibaba's Qianwen model today. Why did you choose it?

Misa: There are two main reasons. First, Rokid is a believer in open source. We give priority to open-source options, which is safer for a startup like ours. When we planned our products last year, the main choices that were both open-source and excellent were Qianwen and DeepSeek. Second, Alibaba Cloud's infrastructure provides more stable support, so we integrated Qianwen by default.

Feng Dagang: Since we're talking about Alibaba, and this is an event hosted by Alibaba Cloud, how do you view the competition between large companies like Alibaba and a startup like yours?

Misa: This is an interesting question. I worked at Alibaba for four years. I think the biggest challenge for large Internet companies when making hardware is often not ability but "persistence". A hardware startup is like a marathon, requiring all-in focus and patience. But large companies have many businesses. If a project is not a strategic core in the short term, resource investment and team attention may shift. This is the core advantage of startups: we have no choice but to persevere. Of course, large companies start strong with their "three-pronged approach": media resources, channel subsidies, and ecosystem collaboration. For startups to withstand the competition, we must establish our brand, make our product fundamentals solid, and rely on user word of mouth and community sharing to break through. Our users at Rokid have a high recommendation rate,