Exclusive Interview with the CEO of Timekettle: Unafraid of Apple's Entry, the AI Translation Earphone Market Is Far from the Endgame
Author: Ou Xue
Editor: Yuan Silai
When tech giants decide to combine software and hardware, a single move can pull the ground out from under many startups.
In 2025, with AI translation accuracy exceeding 95%, AI translation earphones may be the hardware category most vulnerable to being swallowed by the giants: they need to change nothing at the hardware level, only integrate the software.
In the past year or two, almost every Android device manufacturer has added translation features to its earphones. But it wasn't until the early morning of September 10th, when Apple announced that AirPods would gain real-time translation, that entrepreneurs in the translation earphone industry truly felt the chill: Apple alone holds 23% of the global TWS market.
Translation is an obvious need, and sooner or later it will become a standard feature in every major manufacturer's earphones. The impact was not unexpected.
In fact, around 2023, once AI translation software became free, the market was flooded with white-label products. Independent brands that refused to fight price wars and managed to stand out then had to face competition from the giants head-on.
Shenzhen-based Timekettle is one of the hardware makers that has persevered. Even as competition in the translation market intensified, Timekettle's revenue reached 200 million yuan in 2024, and cumulative revenue kept growing strongly in 2025. It has ranked first in the industry for six consecutive years, with more than one million cumulative users across over 170 countries.
Timekettle's founder, Tian Li, came from Huawei, and perhaps because of that background the company has always put hardware first. Its translation earphones are priced at around 1,000 yuan and use hardware to enable "speaking and translating at the same time," aiming to restore a natural conversational experience.
Facing Apple's AI features, Tian Li has his own answer. "The logic of the tech giants is to add translation to their existing ecosystems," he said. "Our logic is to rebuild the entire cross-language communication system."
This is not a challenge unique to Timekettle. "Across the whole translation industry, maturity is still lacking," Tian Li said. In an era of niche markets, they must find a differentiated position among the many niches.
Tian Li believes that today's AI mostly produces literal translations; only about 10% to 20% of the time does it grasp cultural context and translate naturally. AI represents the peak of technical rationality: it can handle basic translation, but language needs emotion to carry its fuller meaning. A translation product people keep coming back to cannot be a mere tool.
"Language barriers are an ancient and widespread problem, and the profession of translation has existed for thousands of years," Tian Li said. "Human translators often don't translate word - for - word; instead, they understand your intention and express it accurately."
Large companies may not have the patience for this kind of continuous, meticulous refinement. That is also the opening for startups.
We had a conversation with Tian Li to discuss the current market competition and the future direction of AI earphones.
The Industry Is Still in Its Early Stages
Yingke: How do you view the current competition in the AI translation field?
Tian Li: To be honest, we don't pay much attention to competition within the industry. The most fundamental reason is that I think the whole industry is still far from mature. Comparing who is better at this stage is like competing to be the least bad, which doesn't mean much.
Currently, competition in the industry is not intense. However, as one of the companies at the front of the industry, we do face more imitation. For example, we were the first to introduce modes such as two-way simultaneous interpretation and external-speaker conversations.
Our core focus has always been on solving problems, especially removing barriers to communication. Language barriers are an ancient and widespread problem, and the profession of translation has existed for thousands of years; solving it is extremely valuable. We don't want to blindly chase trends. Instead, we hope that one day users won't even remember whether we make earphones or some other hardware, only that we can help them cross language barriers.
Yingke: How do you view the entry of tech giants like Google and Apple into the translation field? Do you feel pressured?
Tian Li: First of all, I'd like to reiterate that we don't pay much attention to competition, because the industry is indeed not mature enough. Secondly, the logic of tech giants is to "add functions" to their existing ecosystems (for example, Apple adding translation functions to AirPods), while our logic is to "reconstruct the entire communication system." Different starting points will lead to different efficiencies and experiences in the final solutions.
Yingke: What is Timekettle's core competitive advantage compared to its competitors?
Tian Li: I think it lies in how we perceive the problem. Our focus is on the communication scenario itself rather than on a particular hardware form. From the very beginning we didn't simply take the earphone form as given; we asked whether the interaction experience of earphones actually lets two people communicate more naturally, and if it didn't, we changed the earphones. That's why earlier generations of our products didn't even use Bluetooth earphone chips: our focus was never on traditional functions like taking calls. Over time, this underlying logic builds a moat in brand and user perception.
W4 AI Simultaneous Interpretation Earphones: bone voiceprint recognition technology (Source: Company)
Yingke: Timekettle's overseas revenue accounts for 70%. What unexpected challenges have you encountered when adapting to the cultures, languages, and policies of different markets?
Tian Li: The most challenging part is understanding users. In China, a large share of usage scenarios involve business negotiations. The United States, by contrast, is a country of immigrants: about 20% of the population are immigrants, and about half of them, more than 20 million people, have limited English proficiency. Many people use our products in their day-to-day immigrant lives, which might seem novel in China but is quite common abroad.
Yingke: How is the company's current financial situation? Is it profitable?
Tian Li: The company has achieved profitability. We don't deliberately pursue rapid revenue growth. Instead, we focus more on maintaining healthy profitability and product maturity. In the early stage of the industry, pursuing excessive growth can have a negative impact.
Yingke: Currently, many translation devices rely on cloud-based large models, but there is also a strong industry trend toward edge-side AI. Will Timekettle consider deploying edge-side AI?
Tian Li: Edge-side AI is very important to us, because network conditions are poor in many places. Expectations for our products are high; once the network fails, two people simply cannot communicate and are left stranded. So edge-side AI is something we have to build.
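The interview doesn't describe Timekettle's implementation, but the idea of degrading gracefully from the cloud to a smaller on-device model can be illustrated with a minimal sketch. Everything below (the function names, the timeout, the fallback model) is a hypothetical illustration, not the company's actual design:

```python
# Illustrative sketch only: a cloud-first translation call that falls back to a
# smaller on-device (edge) model when the network is unreachable or too slow.
# All names here (cloud_translate, edge_translate, TranslationResult) are hypothetical.

import socket
from dataclasses import dataclass


@dataclass
class TranslationResult:
    text: str
    source: str  # "cloud" or "edge"


def cloud_translate(audio_chunk: bytes, timeout_s: float = 1.5) -> str:
    """Placeholder for a cloud ASR + translation request with a hard timeout."""
    raise socket.timeout  # simulate an unreachable network in this sketch


def edge_translate(audio_chunk: bytes) -> str:
    """Placeholder for a compact on-device model: lower quality, always available."""
    return "[edge] best-effort translation"


def translate(audio_chunk: bytes) -> TranslationResult:
    # Prefer the cloud path for quality, but degrade instead of failing, so two
    # people mid-conversation are never left with nothing.
    try:
        return TranslationResult(cloud_translate(audio_chunk), source="cloud")
    except (socket.timeout, OSError):
        return TranslationResult(edge_translate(audio_chunk), source="edge")


if __name__ == "__main__":
    print(translate(b"\x00" * 320))  # falls back to the edge path in this sketch
```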
Yingke: Will you consider other multimodal interaction methods besides "voice" in the future? For example, gestures and tones?
Tian Li: That's a bold idea. For some companies making smart glasses, it's quite difficult to understand users' intentions through gestures because gestures are non - standardized.
However, I think tones can be relatively standardized. For example, angry, inquisitive, confused, or happy tones can gradually be standardized. This is also a research direction within our company.
AI Translation Is Still Far from Natural
Yingke: How do you think the relationship between AI translation and human translation will develop in the future? How does Timekettle define this relationship?
Tian Li: We classify AI translation into levels L1 to L5, similar to the levels of autonomous driving. I think most products today sit at around L3. Human translators, especially skilled ones, are undoubtedly at L4 or even L5.
The biggest difference between L3 and L4/L5, in my opinion, lies in the level of trust. L3 is just usable, while L4 and L5 are levels where you can truly trust the system to handle communication and problem - solving for you.
Improving on this involves two aspects. First, the system needs human-like empathy: it must understand and convey what you really mean rather than translating literally. Human translators work this way, grasping your intention and expressing it accurately, and this is the key dividing line between L3 and L4. Second, there are still many details of the product's interaction experience to refine.
Yingke: How long do you think it will take to make the leap from L3 to L4?
Tian Li: Optimistically, I think two to three years, because both aspects have to evolve in parallel. The first, empathy and intent understanding, is something the whole industry is pushing forward as a branch of large-model intelligence. The second requires product companies to think hard about how to further improve the user experience.
Yingke: In terms of user experience, there are many cultural metaphors and idioms in different languages. How does Timekettle try to understand and preserve these "implications"?
Tian Li: I think this is a problem for the entire industry, and it's hard for a single company to solve on its own. What we can do now is collect more language data; for example, we are experimenting more in language pairs such as Chinese-English and Spanish. In the long run, large companies like Google will do some of the foundational work, and we can build on it and focus on the details we do better.
Yingke: How much do you think AI translation currently understands culture?
Tian Li: I think it might only be 10% to 20% currently. Most efforts are still focused on literal translations. However, I believe that in about half a year, you will see significant improvements in our products.
For example, our products now offer two modes: basic AI translation and large-model translation. Technically these are different paths, but what consumers care about is whether the translation understands their intent or is just rigid and mechanical. You will see a clear difference between the two modes going forward.
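As a rough illustration of how such a dual-mode setup might route requests, consider the sketch below. The mode names, latency budget, and functions are assumptions made for the example, not Timekettle's published design:

```python
# Illustrative sketch: routing between a fast "basic" translation path and a
# slower large-model path that handles intent and idiom more naturally.
# Mode names, thresholds, and functions are hypothetical.

from enum import Enum


class Mode(Enum):
    BASIC = "basic"          # fast, literal, low latency
    LARGE_MODEL = "large"    # slower, better at intent and cultural nuance


def basic_translate(text: str) -> str:
    return f"[literal] {text}"


def llm_translate(text: str) -> str:
    return f"[natural] {text}"


def translate(text: str, mode: Mode, latency_budget_ms: int = 800) -> str:
    # In a live conversation, a tight latency budget may force the basic path
    # even when the user has selected large-model output.
    if mode is Mode.LARGE_MODEL and latency_budget_ms >= 800:
        return llm_translate(text)
    return basic_translate(text)


print(translate("It's raining cats and dogs", Mode.LARGE_MODEL))
```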
Yingke: How do you view and define the responsibility for "hallucinations" (random and inaccurate translations) or errors in translation?
Tian Li: To be honest, in the United States, our products are also used in some medical institutions. However, the definition of relevant responsibilities is still unclear.
Just as liability for accidents involving autonomous vehicles is still legally unsettled, our products today are mostly at the L3 level and are used in an assistive capacity. That said, the stakes in AI translation are lower than in autonomous driving, which involves life-and-death decisions.
W4Pro AI Simultaneous Interpretation Earphone for Business Negotiations (Source: Company)
Yingke: What will the next - generation AI translation device look like? Does Timekettle have any latest research directions to share?
Tian Li: The most ideal future product is very simple: it should enable natural communication like we have now. However, we all know that we are still far from this goal, and this is what we are currently working hard towards.
Currently, we are focusing on two main technical directions. The first is sound capture: it must remain accurate even in complex environments. The biggest difference from traditional earphones is that they only need clean audio, whereas we need audio that is both clean and high quality, so that once it reaches the cloud the machine can recognize and translate it better. The second is the intelligence of the AI itself: supporting more less widely spoken languages, showing more empathy, and translating meaning more accurately.
Over the next two to three years, the single overriding goal is to upgrade the product experience from L3 to L4. We believe that once that is achieved, the company's financial indicators, such as revenue and profit, will no longer be a problem.