
Interview with Song Gang, head of Alibaba's intelligent terminal business: The evolution of AI glasses is a tough battle over ecosystem and battery life

未来一氪 2025-07-31 11:59
At WAIC 2025, Alibaba showcased the Quark AI glasses, emphasizing user stickiness and the integration of ecosystem and technology.

As large models move from technological singularity to industrial foundation, and as intelligent agents move from laboratories to production lines and clinics, the third wave of artificial intelligence is reshaping the global economy with unprecedented force.

China shows a dual advantage in this transformation: it is a testing ground with ultra-large-scale application scenarios, and it is also pushing into hard problems such as chip breakthroughs and open-source algorithms. From single-point technological breakthroughs to ecosystem-level innovation, and from efficiency tools to engines of new-quality productivity, a distinctively Eastern path of AI development is rapidly emerging.

On July 26, the World Artificial Intelligence Conference (WAIC 2025), themed "Intelligent Era, Global Solidarity," gathered technology giants, academic pioneers, and policy-makers from the AI field. This major gathering spanning technology, ethics, and art signals that AI has evolved from an "industrial variable" into a "civilizational constant."

At this major AI industry event, 36Kr acted not only as an industry observer but also as an industry connector. It set up the "Krypton Star Live Studio" in the exhibition hall to explore, through dialogues, the underlying logic of the AI industry's progress.

In the dialogue, Song Gang, head of Alibaba's intelligent terminal business, said that the key indicator that will determine the market landscape for AI glasses is user stickiness, that is, user activity and usage time. Alibaba's goal is to create a pair of "super glasses" that are comfortable to wear and easy to use, and that ultimately become users' personal mobile gateway in the AI era.

The following is the transcript of the dialogue, edited by 36Kr:

36kr: Please briefly introduce yourself and your company. How do you feel about attending this WAIC, and what have you gained?

A: I'm Song Gang from the intelligent terminal business of Alibaba's intelligent information business group. Our team focuses on the design, R&D, production, and sales of intelligent terminal hardware, such as the Wowo Bestie Machine, smart speakers, and the Quark AI glasses showcased at this WAIC. I'm very glad to take part in this conference and to discuss with experts and industry partners from home and abroad how to empower everyday life with AI and bring more AI technology products to consumers.

36kr: AI glasses currently have three major pain points: a lack of useful scenarios, short battery life, and uncomfortable wear. How do Alibaba's glasses achieve a three-way balance of performance, power consumption, and comfort? What engineering innovations are unique to them?

A: In terms of scenarios, we have built the Quark AI glasses into a "real portable AI gateway" that meets users' needs in more scenarios through "Alibaba's self-owned ecosystem construction + industry expansion cooperation."

We have also drawn on the group's internal ecosystem to build an AI capability matrix covering vertical scenarios such as search, navigation, payment, personal travel, and business travel, so that AI can better serve users' lives.

To address the insufficient battery life of today's AI glasses, the Quark AI glasses adopt a dual-core, dual-system design that optimizes battery usage by intelligently scheduling work between a high-performance chip and a low-power Bluetooth chip. We have also innovated on the engineering side with a dual-battery, swappable-battery design. Users can quickly replace the main battery by hot-swapping the temples. Paired with a portable battery-swap case the size of an earphone case, the glasses can achieve 24-hour battery life for all-day use.

To improve wearing comfort, the contact surfaces of the nose pads and ear bends use a bionic curved design to distribute pressure evenly. We also precisely place the device's center of gravity at the geometric center of the glasses for the best balance when worn. In addition, we use temples made of integrally injection-molded titanium alloy combined with an elastic ratchet hinge. This keeps the glasses light while holding them securely in place during sports.

In summary, the Quark AI glasses achieve this three-way balance through in-depth system customization, trimming, and optimization, together with in-depth in-house R&D and the ability to customize industrial components.

In terms of overall device architecture, we custom-developed small speakers with dual voice coils and large diaphragms, designed ultra-narrow integrated FPCs, and used coated high-refractive-index lenses to further reduce the grating area of the waveguide region. As a result, our Wowo glasses have thinner temples, narrower and thinner frames, and more transparent-looking lenses.

In terms of wearing, the contact surfaces of the nose pads and ear bends use a bionic curved design to distribute pressure evenly. We also precisely place the device's center of gravity at the geometric center of the glasses for the best balance when worn. In addition, we use temples made of integrally injection-molded titanium alloy combined with an elastic ratchet hinge. This keeps the glasses light while holding them securely in place during sports.

In terms of electronic architecture, the glasses adopt a dual-battery, swappable-battery design. Users can quickly replace the main battery by hot-swapping the temples. Paired with a portable battery-swap case the size of an earphone case, the glasses can achieve 24-hour battery life for all-day use.

In terms of the operating system, the glasses run an Android + RTOS dual system with dynamic resource scheduling, which both significantly improves energy efficiency under heavy load and effectively reduces power consumption in standby.

In terms of the imaging system, we independently developed the Super Raw low-light processing algorithm. Through multi-frame fusion and adaptive noise reduction in the RAW domain, it effectively suppresses noise in low light and significantly improves the image signal-to-noise ratio.
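To illustrate why multi-frame fusion raises the signal-to-noise ratio, here is a minimal Python sketch. It is not the Super Raw algorithm, whose details are not given in the interview; it assumes the simplest possible fusion (a plain average of perfectly aligned frames with independent Gaussian noise), and all function names and values are hypothetical. Under those assumptions, averaging N frames should reduce the noise standard deviation by roughly a factor of the square root of N.

```python
import random
import statistics


def simulate_frame(true_signal, noise_sigma, rng):
    """Return one noisy capture of a static scene (hypothetical sensor model)."""
    return [value + rng.gauss(0.0, noise_sigma) for value in true_signal]


def fuse_frames(frames):
    """Fuse aligned frames by per-pixel averaging (assumed stand-in for real fusion)."""
    count = len(frames)
    return [sum(pixels) / count for pixels in zip(*frames)]


def noise_std(frame, true_signal):
    """Standard deviation of the per-pixel error against the known true signal."""
    return statistics.pstdev([f - t for f, t in zip(frame, true_signal)])


rng = random.Random(0)            # fixed seed so the run is repeatable
true_signal = [100.0] * 2000      # flat gray patch, illustrative values only
frames = [simulate_frame(true_signal, 10.0, rng) for _ in range(8)]

single_noise = noise_std(frames[0], true_signal)
fused_noise = noise_std(fuse_frames(frames), true_signal)
print(f"single frame noise: {single_noise:.2f}")
print(f"fused (8 frames) noise: {fused_noise:.2f}")
```

A production pipeline would also need frame alignment, motion rejection, and adaptive denoising, none of which this sketch attempts.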

36kr: The glasses integrate Alibaba's service ecosystem. In your opinion, which type of scenario can become a daily necessity for users? What technical barriers need to be overcome in such scenarios?

A: We believe that any everyday scenario related to food, clothing, housing, and transportation can become a necessity for users once AI capabilities and services reach into it. Alibaba's ecosystem is well placed to empower these consumer necessity scenarios. For example, reminders of various kinds (schedules, notifications) are high-frequency everyday scenarios. Through cooperation with Fliggy, we can provide flight information reminders.

With AI glasses, users can pay safely and quickly without taking out their mobile phones. AI glasses are especially convenient in hands-free scenarios, such as scanning a code to pay while walking or taking photos while hiking outdoors. In cross-language communication scenarios, functions such as translation and simultaneous interpretation are also very important.

For work and study scenarios that call for greater efficiency, functions such as meeting records and encyclopedia Q&A can also help users get more done.

The first technical barrier to overcome is AI's ability to perceive and understand, which includes voice and image recognition, semantic understanding, and intent judgment. This requires a powerful AI central-control capability that can sensibly decompose tasks and call different ecosystem services. Only then can the glasses truly "hear clearly, see and comprehend, understand accurately, and answer well," and build stronger user stickiness.

The second is end-cloud collaboration. The glasses need to respond with low power consumption, and the cloud needs to connect to the cloud AI services of different ecosystem partners to process information. The two sides must cooperate and respond under different network conditions to deliver a fast, accurate, closed-loop experience.

Alibaba's advantage is that it has multimodal AI capabilities and also controls the service ecosystem and system capabilities. It can integrate these technologies into real scenarios, making AI glasses truly "useful, easy to use, and indispensable."
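The end-cloud collaboration described above can be pictured as a routing decision made per request. The following Python sketch is purely illustrative and is not Alibaba's implementation: the `Task` and `Conditions` types, the thresholds, and the backend labels are all hypothetical, chosen only to show how a device might pick between on-device and cloud processing under different network and battery conditions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conditions:
    network_rtt_ms: Optional[float]  # None means the device is offline
    battery_pct: int                 # remaining battery, 0-100


@dataclass
class Task:
    name: str
    needs_large_model: bool          # True for complex reasoning tasks


def choose_backend(task: Task, cond: Conditions,
                   max_rtt_ms: float = 300.0,
                   low_battery_pct: int = 15) -> str:
    """Pick where to run a task. Thresholds are assumed values, not measured ones."""
    online = cond.network_rtt_ms is not None and cond.network_rtt_ms <= max_rtt_ms
    if task.needs_large_model:
        # Complex tasks go to the cloud when the link is usable; otherwise
        # degrade to a smaller on-device model instead of failing outright.
        return "cloud" if online else "device_fallback"
    if cond.battery_pct < low_battery_pct and online:
        # Light tasks normally stay local for latency, but offload to save
        # power when the battery is low and the network is good.
        return "cloud"
    return "device"


wake_word = Task("wake_word", needs_large_model=False)
trip_plan = Task("plan_business_trip", needs_large_model=True)

print(choose_backend(wake_word, Conditions(network_rtt_ms=40.0, battery_pct=80)))
print(choose_backend(trip_plan, Conditions(network_rtt_ms=None, battery_pct=80)))
```

A real system would weigh many more signals (model availability, privacy constraints, partner service latency), but the same idea applies: the decision is re-evaluated as conditions change.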

36kr: In the face of industry competition, what is the core competitive barrier of Alibaba's glasses? Which technological breakthrough do you think will cause the AI glasses industry to truly take off in the next three years?

A: The core competitive barrier of the Quark AI glasses consists of three things: AI applications built on Alibaba's self-developed large models, the ability to integrate internal and external ecosystems, and technological innovation based on self-developed software and hardware.

The technological breakthrough points fall into three areas: hardware, AI capabilities, and ecosystem. In terms of hardware, improvements are needed in multiple aspects such as near-eye display, photography, and perception capabilities. At the same time, miniaturizing the form factor is also a key factor.

Ergonomic wear: Highly integrated flexible circuit boards, high-capacity and high-density batteries, and next-generation composite materials to deliver a miniaturized design and the best possible wearing experience.

Near-eye display: Breakthroughs in miniaturized dual-optical-engine binocular displays to address industry pain points such as display power consumption and thickness.

Acoustics and audio: Improving the open acoustic system's adaptive, personalized services based on the user's environment.

Photography and perception: Integration and improvement of camera sensors, computing capabilities, and imaging/perception algorithms to revolutionize imaging and visual perception capabilities.

AI capabilities: For AI glasses to be truly "intelligent," powerful voice recognition is not enough; what matters more is the comprehensive evolution of understanding and decision-making capabilities.

Advanced voice recognition and understanding: High-precision voice recognition in noisy environments (noise reduction, separation), support for colloquial and quiet-voice interaction, support for real-time multi-language translation, and low-latency response.

Multimodal dialogue understanding: Dialogue understanding across voice, text, and vision captures users' interaction intentions more accurately, letting users shift from issuing "operation instructions" to "directly expressing needs in one sentence." First-person visual multimodal understanding lets the glasses both hear and see, providing a more natural experience.

Super intelligent agent: AI glasses may become users' closest digital assistants, able not only to answer questions and execute commands but also to handle complex tasks and understand users' long-term goals and values.

End-cloud collaborative reasoning: AI models need to run quickly on the lightweight, low-power device side while drawing on cloud-based large models for high-precision reasoning on complex tasks. The two need to switch adaptively under different network and computing conditions to ensure a stable experience and instant response.

Continuous learning and personalization: The model needs to keep learning users' habits and contexts to build a more personalized service system, evolving from "general AI" to "exclusive AI" and truly becoming each user's personal, portable intelligent assistant.

AI content ecosystem: The usage frequency and stickiness of AI glasses ultimately depend on the content and service ecosystem established around them. We integrate the main content ecosystems of the entire industry and provide integrated content services from various dimensions.

From passive acquisition to active recommendation: In how content and information services are delivered, proactively push valuable information, services, and action paths based on users' current scenarios and historical preferences, truly "anticipating what users are thinking."

Deep integration with the ecosystem: In terms of breadth, keep expanding from today's core scenario services such as search, navigation, payment, personal travel, and business travel to more scenarios such as daily travel, shopping, learning, and work, so that users can complete the full closed loop from "perception" to "action" just by wearing the glasses.

Deep integration of vertical knowledge and services: In terms of depth, combine with vertical industry content systems (such as tourism, education, and business) to build high-quality, structured, and machine-understandable AI content sources, giving AI glasses the content value that drives long-term retention and repeat purchases.

Third-party content service ecosystem: Support access for third-party content and capabilities so that content and service developers can jointly build a new ecosystem for AI glasses, making AI glasses a growing intelligent platform.

36kr: Once AI glasses have fully taken off, what do you think is the ultimate indicator that will determine the market landscape? What is the goal of Alibaba's glasses?

A: We believe that the key indicator determining the market landscape for AI glasses is user stickiness, that is, user activity and usage time. The main factors determining user stickiness are:

1. Continuously solve problems and provide value for users: Deliver an excellent baseline software and hardware experience across the whole device, responding quickly and accurately to users' needs and even anticipating them; make the device usable all day across multiple scenarios; offer a rich and practical application and content ecosystem, with diverse AI-native services that cover all aspects of users' daily lives; and have strong multimodal understanding and scheduling capabilities to better understand users' intentions and provide AI services quickly and accurately.

2. Continuously penetrate and grow in the product market: Build a rich and diverse product matrix. Enter the AI glasses market with a product combination of "high-end display + mainstream photography," establishing brand awareness with high-end products and covering the mass market with mainstream products to create a synergy of "high-end benchmark + scale growth." Improve the sales and service experience by providing users with professional, comprehensive optometry, lens fitting, and vision data services of the kind offered in the eyewear industry.

Alibaba's goal is to create a pair of "super glasses" that are comfortable to wear and easy to use, and that ultimately become users' personal mobile gateway in the AI era.

36kr: Looking ahead to the next 3-5 years, what are the most transformative changes that you think AI will drive in your industry? Where will the key impact of these changes on industry competition show up? How is your company preparing in advance?

A: AI glasses have the triple attributes of consumer glasses, AI innovative technology products, and digital electronic products. Therefore, the core expected changes are mainly as follows:

1. In the field of glasses: Integrate with glasses brands in terms of technology, channels, services, and C2M customization capabilities to promote the transformation of the glasses industry value chain.

2. In the field of consumer electronics: Build in-depth self-developed software and hardware development capabilities, and cooperate with upstream core supply chains (such as optics, chips, sensors, etc.) to create a new natural interaction method integrating AI multimodal perception capabilities, promoting the transformation of human-machine interaction methods.

3. In the field of the AI ecosystem: Move from limited AI service scenarios to diversified ones, and from users choosing AI services to AI actively perceiving and understanding users' intentions and recommending services, promoting the transformation of the AI application content ecosystem.

We will cooperate with globally leading glasses brands and integrate Tongyi Qianwen's large-model capabilities with Quark's AI application capabilities to build an intelligent interaction experience that can hear and see. Driven by the dual-wheel model of "Alibaba's self-owned ecosystem construction + industry expansion cooperation," we will build an AI service matrix covering high-frequency scenarios such as search, navigation, payment, and business travel, and keep adding ecosystem services that can be deployed in practice, to reshape how people connect with information and accelerate the development of the entire ecosystem.