A camera is attached to the earphone, but it's not for human use.
As the end of the year approaches, Guangfan Technology, a previously little-known startup, has released a product that seems somewhat "counter-intuitive": the Lightwear AI All-Senses Smart Suit (hereinafter referred to as Lightwear).
Broadly speaking, it is a set of smart earphones plus a smartwatch. The specific details, however, are more interesting:
First of all, each earphone carries a 2-megapixel camera. A single earphone weighs 11g, leaving room for enough battery to keep the visual function running. The smartwatch serves as the display terminal and an additional input surface. The smart hub of the suit, however, is not necessarily a phone but the earphone case, which has built-in eSIM capability and a GPS chip; the smartwatch can connect to it directly.
This means Lightwear can operate entirely on its own, without a mobile phone.
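To make that topology concrete, here is a minimal sketch of the suit as described above. It is purely illustrative and based on my own assumptions; the class and field names are invented for this article and are not Guangfan's actual firmware or SDK.

```python
# Illustrative model of the Lightwear topology described above.
# All names and defaults are assumptions made for the sake of the sketch.
from dataclasses import dataclass, field


@dataclass
class Earbud:
    side: str                # "left" or "right"
    weight_g: float = 11.0   # each earphone weighs about 11 g
    camera_mp: float = 2.0   # 2-megapixel camera feeding the AI's visual context


@dataclass
class Watch:
    role: str = "display terminal + extra input"   # a terminal, not the hub


@dataclass
class EarphoneCase:
    """The case, not a phone, acts as the smart hub: eSIM uplink plus GPS."""
    has_esim: bool = True
    has_gps: bool = True
    paired: list = field(default_factory=list)

    def pair(self, device) -> None:
        # Earbuds and the watch connect to the case directly,
        # which is why the whole suit can run without a phone.
        self.paired.append(device)


if __name__ == "__main__":
    hub = EarphoneCase()
    for device in (Earbud("left"), Earbud("right"), Watch()):
        hub.pair(device)
    print(f"hub online: eSIM={hub.has_esim}, GPS={hub.has_gps}, "
          f"peripherals={len(hub.paired)}")
```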
This design choice is unprecedented in the industry. Exposed cameras hanging from the earphones beside the ears challenge mainstream aesthetics even more than camera-equipped smart glasses do, and they also touch the sensitive nerve of privacy.
However, if you look at where the entire technology and consumer-electronics industry is heading over the next 5-10 years, you will find that OpenAI, Meta, Alibaba's Quark, Li Auto, and Apple have reached a consensus on similar product definitions. Guangfan Technology has simply productized that consensus ahead of these giants and large companies.
That is: AI needs to truly understand the world, and relying solely on microphones is no longer enough.
On the other hand, the multi-modal capabilities of the models are forcing product design to serve the models' needs.
In other words, whether it is Guangfan's camera-equipped earphones or the more widely accepted but still controversial smart glasses, these product forms are dictated by what the models can do, and have nothing to do with aesthetics.
An AI hardware company spun off from Xiaomi
Guangfan Technology was founded in October 2024. Its founder, Dong Hongguang, was part of Xiaomi's founding team, carrying employee number 89. During his 14 years at Xiaomi he was a core contributor to major projects such as MIUI, Quick Apps, Xiaomi's self-developed phones, and the automotive OS.
According to the company's official introduction, the founding team is a typical "high-caliber lineup": besides Xiaomi, it includes senior experts from Huawei, ByteDance, Alibaba, and Tencent, with deep capabilities across software, hardware, and AI development.
Even more striking is the pace of its fundraising. Within three months, Guangfan Technology closed two rounds totaling 130 million RMB, at a post-money valuation exceeding 500 million RMB. Investors include well-known funds and institutions such as Borui Capital (founded by Li Ping, the vice-chairman of CATL), AfterShokz, Tongge Venture Capital (under Goertek), Qinghui Investment, CDH Investments, Alpha Square Group, and Inno Angel Fund.
The industrial capital on that list is particularly eye-catching, coming mostly from audio and advanced-manufacturing giants: AfterShokz holds more than 50% of the bone-conduction and open-ear earphone market, Goertek is the leading ODM for wearable devices, Qinghui Investment is backed by GigaDevice, a leading memory-chip company, and CATL needs no introduction.
The participation of these industrial investors not only buys this company, and this still-immature product form, room for trial and error; it also shows industrial giants positioning themselves early.
The camera is there so the AI can see what you see
Over the past 20 years, the main thread of human-computer interaction has been very clear: type, touch the screen, take a photo, upload, then wait for the device to respond. Even though the software and services built into devices can do a great deal today, the logic of interaction has not changed: you operate the device, and the device gives you feedback.
In the past 3-5 years, the new wave of AI built on large language models has upended that logic. Because the models can process multi-modal information, understand the relationships among images, sound, and text, and come closer to "human intuition", AI products driven by large models can initiate far more proactive interaction with users and with the digital world they live in, and even with the real world.
From OpenAI, Apple, and Meta in Silicon Valley to the large Chinese companies, camera-equipped AI devices have become a consensus direction. The reasoning is not complicated: voice can only capture "the world you describe", but with a camera, AI can genuinely understand where you are, what is in front of you, and what is happening in the world.
The imagined OpenAI earphone hardware
The question is: Do I have to take out my phone every time AI needs to understand something? Is there no better place for the camera?
There are only two practical options: wear it on the head or attach it to the body.
By the end of 2025, we have seen countless pioneers, failures, leaders, and laggards in these two areas.
In body-worn devices, the Humane AI Pin and Rabbit R1 were once hailed in Silicon Valley as "the next iPhone", but both ended prematurely for arriving too early and delivering too little. Even so, people keep iterating in this space, such as the recently re-introduced Looki.
Others revisited Google Glass and the VR headsets that were popular more than a decade ago and combined the two ideas into a new generation of smart glasses. This category is currently well regarded in Silicon Valley; because it can be folded organically into the glasses people already wear every day, it enjoys relatively higher acceptance. Still, some believe smart glasses are not ideal and will never become a real replacement for the phone.
Then, earphones came into the picture. Among mobile phones, wearable devices, and smart glasses, earphones occupy a delicate position: they are socially accepted for long-term wearing and are naturally close to the two core senses of "vision" and "hearing". This makes them a reasonable carrier for AI's perception and computing capabilities and the next trial-and-error space for AI hardware.
Earphones sit closer to the eyes and ears, and consumers are already well accustomed to wearing them, so acceptance is broad. More importantly, compared with the conspicuousness and weight of glasses (at least 40-plus grams), Lightwear's earphones are light (11g each); even though the added camera gives them a slightly "foreign-body" feel, they remain less noticeable in social situations than glasses.
A product logic that is model-first rather than user-first
The market for AI earphones that rely solely on voice is fairly saturated and has clearly hit a bottleneck. By ifanr's observation, most of the so-called AI earphones on the market today are priced at or below roughly 1,000 RMB, focus mainly on AI translation, and are becoming increasingly homogeneous.
What Guangfan thinks about and does with Lightwear is very different. Ordinary earphones stay confined to the category of "hearing"; Guangfan goes one step further and asks a deeper question: can AI obtain more context through the earphones?
The answer to this question actually lies in the fundamental transformation of the interaction mode in the AI era.
From computers to mobile phones, we have so far lived in the era of the GUI (Graphical User Interface), where screens, buttons, and icons are indispensable because we precisely manipulate each object we operate on.
Generative AI changes this logic: interaction can rely entirely on natural language. You give the system vaguely worded instructions, and it returns results that are imprecise but usable. Frequent communication and feedback matter more, while precision matters less. That is the NUI (Natural User Interface): speaking and listening become the more natural channels, and the graphical interface is no longer essential.
It makes sense for this new interaction paradigm to land in earphones: they can weigh 10g or even less, be worn without burden, run all day on a charge, and stay online all day long. It is like an intelligent peripheral for the human body, always online and ready to serve.
But this intelligent peripheral still lacks one thing: like a human, it needs to take in enough information, and among all the dimensions of perception, vision is the richest and most important.
So, the conclusion is clear — a camera should be added to the earphones.
At the product launch event, Guangfan demonstrated the practical applications of Lightwear's combined perception capabilities. These scenarios cover high-frequency needs in daily life and work:
- O2O: The user wakes the device and asks, "Check what this place is like for me." The earphones identify the restaurant's signboard through the camera, confirm the location with GPS, weigh the user's tastes against the memory the AI has accumulated, recommend better restaurants nearby, proactively take a queue number, and remind the user when their turn comes.
- Business travel: On receiving a business-trip text message or email, Lightwear can proactively arrange the schedule, spot and resolve conflicts, reply to the message intelligently, search for and book flights and hotels, and complete the last-mile ride-hailing.
- Shopping: When the user spots an interesting product and asks about it, the earphones recognize it visually, compare prices online, add it to the shopping cart, or even place the order directly.
- Daily reminders: Based on the schedule, it proactively wakes up to remind the user of things such as important anniversaries.
Throughout the process, the user never needs to take out a phone, open an app, or even spell out exactly what they want; the AI combines visual and location information to fill in the necessary context on its own.
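As a thought experiment, here is what such context assembly might look like in code: one camera frame, a GPS fix, and the user's vague utterance fused into a single multimodal request. Every name below (`Context`, `build_request`, `call_model`, and the request fields) is hypothetical; the real service, schema, and memory system behind Lightwear are not public.

```python
# Illustrative only: how a wearable might fuse sight, location, and speech
# into one request so the user never has to open an app.
from dataclasses import dataclass


@dataclass
class Context:
    utterance: str      # what the user actually said, however vague
    jpeg_frame: bytes   # one low-resolution frame from the earbud camera
    lat: float          # GPS fix from the charging case
    lon: float


def build_request(ctx: Context) -> dict:
    """Pack everything the model needs into a single multimodal request."""
    return {
        "instruction": ctx.utterance,                   # e.g. "check what this place is like"
        "image": ctx.jpeg_frame,                        # lets the model read the signboard
        "location": {"lat": ctx.lat, "lon": ctx.lon},   # disambiguates which branch it is
        "memory": "accumulated taste profile (assumed)",
    }


def handle_wake(ctx: Context, call_model) -> str:
    # `call_model` stands in for whatever multimodal endpoint the product uses.
    return call_model(build_request(ctx))


if __name__ == "__main__":
    fake_model = lambda req: f"Found the place near {req['location']}; reviews look good."
    ctx = Context("check what this place is like", b"\xff\xd8", 39.98, 116.31)
    print(handle_wake(ctx, fake_model))    # spoken back through the earbuds
```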
This type of device is naturally suited to scenarios where you can't clearly describe something ("this one", or "no, the one next to it"), where it isn't worth pulling out the phone to take a photo, or where pulling out the phone would break the "flow state" (walking, visiting an exhibition, cooking, and so on).
Are 2 megapixels enough? Yes, because the photos are for the model to see
Judged from the perspective of traditional consumer electronics, Lightwear does have plenty of flaws: the exposed camera raises serious privacy concerns; it is heavier than ordinary earphones, so wearing it all day may not be realistic; there is social pressure; and it is easy to lump it in with failed products like Google Glass and the AI Pin.
But that reading completely misses the point. The camera on the earphones exists to improve how efficiently the AI understands; it is not for human use at all. The design starts from serving the model, which needs a more continuous, more timely visual stream and a more authentic first-person view (FPV).
One key design decision is worth noting: Lightwear's camera adopts an "ephemeral" image-processing mechanism.
In Lightwear's system design, you cannot tell the earphones to take a photo for the sake of "taking a photo", because the camera exists entirely for the AI and is used only for instantaneous visual context. Image files are saved neither locally nor in the cloud; think of them as "destroyed after use". Several considerations sit behind this design:
The first, obviously, is privacy. If no image files are ever stored, leaks are prevented at the root: users need not worry that details of their daily lives will be captured, or even "secretly photographed", and saved without their knowledge.
Not saving photos also cuts costs significantly: since the images are for the AI to see, they never need to meet human-eye quality standards, and 2 megapixels are already sufficient for object recognition and scene understanding. Lower resolution means faster processing, lower power consumption, and smaller storage and bandwidth costs. As it stands, the device lasts 9-15 hours, enough for all-day companionship.
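To picture the mechanism, here is a minimal sketch of an ephemeral capture path under my own assumptions: the frame exists only in memory, is capped at roughly 2 megapixels, is consumed by the model, and is then discarded. The helpers `capture_frame` and `ask_model` are hypothetical stand-ins, not Guangfan's actual implementation.

```python
# Illustrative "destroyed after use" capture path: nothing is written to disk
# or uploaded for storage; the frame is transient context for one answer.
def answer_visual_query(question: str, capture_frame, ask_model) -> str:
    frame = capture_frame(max_pixels=2_000_000)   # ~2 MP is enough for recognition
    try:
        # The image is consumed as momentary context; no file, no gallery, no cloud copy.
        return ask_model(question=question, image=frame)
    finally:
        del frame                                  # drop it as soon as the answer is back
```

Keeping the resolution low is what makes the rest of the trade-off work: smaller frames mean faster round trips, lower power draw, and less traffic, which is how an 11g earbud can plausibly stay useful for a full day.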
Of course, calling this product "model-first, user-second" is just my subjective reading; others, including Guangfan, may see it differently. At the product launch, Dong Hongguang emphasized that AI hardware should "let technology take a step back and put people at the center", but at least by my logic, the actual product puts the technology first.
But these days, which AI hardware can avoid such a sense of contradiction?
Here we can make a bold claim: all AI hardware, now and in the near future, should be defined with the model as the priority and the model's needs as the starting point.
Since we are still far from the boundary of what AI models combined with electronic hardware can be, there is no doubt we will see more things like Lightwear in the future, things you could even call "Frankensteins".
Only by making more attempts, most of them trial and error, can these product companies find out where the boundaries really lie and deliver better user experiences.
In conclusion
Of course, Lightwear is a real product headed for launch, and the suit will not be cheap. I don't want to rate it too highly here and leave anyone with an illusion.
At the product launch, we tried out the "engineering prototype". High-frequency, essential scenarios such as schedule management, message reminders and forwarding, business-travel booking, ride-hailing, restaurant reviews and queue-number taking, and visual search/add-to-cart all ran through smoothly.
However, because the earphones connect directly to the earphone case (over its eSIM 4G network) and the on-site network was mediocre, conversational latency was still quite noticeable, a long way from the ideal dialogue rhythm of the movie "Her". An on-site engineer said the engineering prototype delivers roughly 70%-80% of the experience of the commercial version due to launch in Q1 next year.
To be honest, I came away from Lightwear fairly satisfied. I suspect the "AI earphones with cameras" that OpenAI and Apple plan to launch in 2026 or 2027 may not deliver a much better experience than Guangfan's solution.
This has nothing to do with product strength or engineering capabilities. It