
Zhang Hao, CTO of Huolala: The key to winning in AI lies not in the foundational models, but in the "application scenarios".

富充 · 2025-11-28 18:22
Operational efficiency and user experience are the two core aspects of Huolala.

In 2025, the business world stands at a crossroads of transformation. Amid the reconstruction of business narratives and the sweeping wave of technology, the WISE2025 Business King Conference, with the theme of "The Scenery Here is Exceptionally Beautiful", aims to identify the certain future of Chinese business amidst uncertainties. Here, we record the opening chapter of this intellectual feast and capture the voices of those who remain steadfast in the face of change.

From November 27th to 28th, the 36Kr WISE2025 Business King Conference, hailed as the "annual barometer of technology and business", was held at the Conduction Space in the 798 Art District, Beijing.

This year, WISE is no longer a traditional industry summit but an immersive experience in the form of "technology-infused short dramas". From AI reshaping the boundaries of hardware to embodied intelligence opening the door to the real world; from brand globalization in the wave of going overseas to traditional industries embracing "cyber prosthetics" - we are not only presenting trends but also capturing the insights gained from numerous business practices.

In the following content, we will dissect the real logic behind these "captivating dramas" frame by frame and witness the "exceptionally beautiful scenery" of business in 2025.

The following is the edited transcript of the speech by Zhang Hao, CTO of Huolala:

Good afternoon, everyone! My name is Zhang Hao, and I'm the CTO of Huolala.

During the round-table discussion just now, there was a topic about which industry would be most profoundly affected by AI at present. Mr. Qin from Sweet Potato Robotics replied that it was not yet clear.

Next, I'd like to share with you the application of AI at Huolala. I won't talk about the future for now, but only our development path in the past two years. After my sharing, please make your own judgment on how much impact AI is having on the industry at present.

You may have seen Huolala's vehicles on the street, so I won't go into detail about our business. We were founded in Hong Kong and entered the Chinese mainland in 2014; by now we have a history of 12 years. Beyond China, we also serve more than 400 cities and regions across Southeast Asia, South America, and other markets, with nearly 20 million monthly active users and 2 million monthly active drivers on average. For a matching platform like ours, the most important thing is to facilitate transactions between shippers and drivers. Therefore, operational efficiency and user experience are Huolala's core capabilities, and they are the two areas where we most need AI to make a difference.

Each company has different business scenarios and implementation stages. Two years ago, with the emergence of ChatGPT, we also started research in this area. The first question we needed to solve was in which areas of our industry and company structure could AI play the most significant role?

We referred to the evaluation method in Goldman Sachs' 2023 AI research report and quantified AI's potential efficiency gains through job surveys, task breakdowns, and automation-difficulty ratings. Our conclusion was that generative AI would first trigger a productivity revolution in data-dense, labor-intensive fields. Therefore, we prioritized the implementation of AI in scenarios such as business security, R&D, product development, and operations. However, in scenarios with high certainty requirements and low tolerance for errors, such as data analysis, we believed the time was not yet right.
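
The prioritization described above can be sketched as a simple scoring exercise: rate each function's tasks for how much time they consume and how automatable they are, then rank functions by the weighted result. The function names, weights, and scores below are purely illustrative, not Huolala's actual rubric.

```python
# Hypothetical sketch of task-based AI-potential scoring.
# All numbers are made up for illustration.

def ai_potential(tasks):
    """Weighted share of a function's time that is automatable."""
    total = sum(t["hours_share"] for t in tasks)
    automatable = sum(t["hours_share"] * t["automation_score"] for t in tasks)
    return automatable / total

functions = {
    "business_security": [
        {"hours_share": 0.6, "automation_score": 0.7},
        {"hours_share": 0.4, "automation_score": 0.5},
    ],
    "data_analysis": [
        {"hours_share": 0.8, "automation_score": 0.2},  # low error tolerance
        {"hours_share": 0.2, "automation_score": 0.3},
    ],
}

# Rank functions by estimated AI potential, highest first.
ranked = sorted(functions, key=lambda f: ai_potential(functions[f]), reverse=True)
print(ranked)
```

Under these toy numbers, business security ranks above data analysis, mirroring the prioritization decision in the talk.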

After determining the development direction, the next issue was the path for technology implementation.

Like all technology companies, when we started in 2023, we wanted to develop a vertical large model for our industry. So we invested resources in developing a large model for the freight industry and spent a lot of effort on it.

Finally, we learned two lessons with our time and money:

First, basic large models are advancing rapidly. Instead of spending a lot of time on basic large models, it is better to focus on building up our industry's digital assets, business APIs, and industry know-how.

Second, building our own AI application platform is more important than developing a basic large model. As the basic large models improve, our own AI applications will also become more efficient automatically.

With these two insights, we shifted our focus and stopped obsessing over basic large models. We spent about a year and a half building three comprehensive platform applications: the Dolphin Platform, the Wukong Platform, and the Evaluation and Annotation Platform.

I'll briefly introduce these three platforms to you.

The goal of the Wukong Platform is to enable non-professional users to build a basic enterprise intelligent agent application within the company in just five minutes.

It has three major features:

First, visual process orchestration. You can drag and drop to connect the API interfaces of the company's various data assets.

Second, zero-code intelligent construction. You can build basic intelligent agents through natural language.

Third, it can build an enterprise-level tool library and MCP. As mentioned earlier, our competitiveness doesn't lie in technology but in making good use of the company's digital assets.
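
The tool-library idea behind a platform like this can be sketched as a registry that wraps enterprise APIs once, so that agents are later composed from registered tools rather than calling services directly. All names here are hypothetical, not Wukong's real interface.

```python
# Illustrative tool-registry sketch; not the actual Wukong Platform API.

class ToolRegistry:
    """Registers enterprise capabilities once; agents call them by name."""

    def __init__(self):
        self._tools = {}

    def register(self, name, fn, description=""):
        self._tools[name] = {"fn": fn, "description": description}

    def call(self, name, **kwargs):
        return self._tools[name]["fn"](**kwargs)

registry = ToolRegistry()
# A stand-in for a real internal order-service API.
registry.register(
    "lookup_order",
    lambda order_id: {"order_id": order_id, "status": "delivered"},
    description="Fetch an order from the order service",
)

# An "agent" is then just an orchestrated sequence of tool calls.
result = registry.call("lookup_order", order_id="A123")
print(result["status"])
```

The design point is that the registry, not each agent, owns the mapping from names to company APIs, which is what makes drag-and-drop orchestration over those APIs possible.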

The Dolphin Platform is designed for more professional algorithm developers. It aims to improve efficiency, in a one-stop manner, across everything from training data and model development to deployment, maintenance, and lifecycle management.

We hope that through this excellent internal company platform, we can save algorithm engineers' time in resource management, data processing, model development, and testing.

Of course, the most important part is evaluation after a model is built and launched. We launched the Annotation A/B Testing Platform and the Lala Intelligence Evaluation system, which made our model comparisons and A/B test segmentation far more complete.

Previously, people joked that "there is only as much intelligence as there is manual labor behind it", because so much time and energy used to be spent on annotation and evaluation. Building a good evaluation mechanism ensures that the results of each launch are truly reliable and repeatable.

I briefly introduced our three platforms just now, which we are quite proud of. We learned from our experience that large models are public basic capabilities provided by industries and large companies, but platform applications need to be developed by enterprises themselves. Only by doing this can we move forward more quickly.

Next, I'll share some of our application scenarios. To be honest, none of these scenarios is revolutionary, but I think they are worth promoting and have some minor innovations.

For example, AI-based security prevention and control. In the freight scenario, if you use Huolala, you may notice that there are often some illegal operations, including carrying passengers illegally, transporting dangerous goods, and dangerous driving behaviors.

As a platform, we need to intervene in a timely manner. However, the time window for this is very short, perhaps only a few minutes. If we fail to detect or intervene within these few minutes, problems may occur.

Given the short real-time monitoring window for safe driving and the high accuracy requirements, we use large models together with voice, images, and unstructured data for real-time detection and intervention, and apply tiered handling across the entire order-placing process. Over the past year, the number of risky orders involving dangerous-goods transportation and illegal passenger-carrying has decreased by 30%, and the order reminder rate has reached 100%.
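
The tiered handling mentioned above can be illustrated as a mapping from a detection model's risk score to an escalating action within the short intervention window. The thresholds and action names below are illustrative assumptions, not Huolala's actual policy.

```python
# Hypothetical risk-tiering sketch: detection models emit a score in [0, 1],
# and the platform escalates its intervention accordingly.

def intervention(risk_score):
    """Map a risk score to a tiered action (thresholds are illustrative)."""
    if risk_score >= 0.9:
        return "block_order_and_alert_safety_team"
    if risk_score >= 0.6:
        return "push_in_app_warning_to_driver"
    if risk_score >= 0.3:
        return "send_reminder"
    return "no_action"

print(intervention(0.72))
```

The point of tiering is that only the highest scores trigger the most disruptive response, so the platform can still act within a few minutes without blocking legitimate orders.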

Second, any technology company using AI Coding inevitably asks whether it can improve the efficiency of product development and R&D. The answer is yes.

Since we started using AI Coding more than a year ago, 90% of individuals and teams are now using it. At the same time, the penetration rate of AI Coding in our entire R&D process, from the initial PRD to development, deployment, debugging, and subsequent monitoring, has reached 60%. So the overall penetration rate is relatively high.

However, there are also limitations. We think that at present, AI Coding can only improve work efficiency by about 10%.

If there are engineers here, you know that programmers don't spend all eight hours of a workday writing code. We calculated that engineers spend about 30% of their working time writing code on average. If 30% of that code is generated by AI, then AI Coding accounts for only about 10% of overall engineering output (30% × 30% ≈ 10%). This proportion is not very high.
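
The back-of-the-envelope estimate above is just a multiplication, spelled out here; the two 30% figures come from the talk itself.

```python
# The talk's estimate of AI Coding's overall contribution.
coding_share_of_workday = 0.30   # engineers spend ~30% of time writing code
ai_share_of_code = 0.30          # ~30% of that code is AI-generated

overall_gain = coding_share_of_workday * ai_share_of_code
print(f"{overall_gain:.0%}")  # 9%, i.e. roughly 10%
```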

We found that AI Coding can generate large sections of code for new projects and front-end tasks. However, for complex business logic, developers need to have many back-and-forth exchanges with AI in natural language to get the correct logic, and there's no guarantee that this code will be deployed.

That is to say, although the time spent on writing code is reduced, the time spent on checking, correcting, and testing increases. So overall, the improvement in efficiency is not very significant.

Another minor innovation is the "Take a Photo to Select a Vehicle" function. New customers may not know which type of vehicle suits their cargo, and some may not know the size and weight of their goods. So we developed this function: users take a photo of their goods with their phone's camera, AI calculates the volume through point-cloud segmentation, and the result is automatically matched against the vehicle models in our database. Usually it can make the most suitable recommendation in just 10 seconds. The function has worked quite well.
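
Once the segmentation step has produced a volume estimate, the matching step reduces to picking the smallest vehicle that fits. The fleet data and safety margin below are illustrative assumptions, not Huolala's real vehicle catalogue.

```python
# Hypothetical matching step for "Take a Photo to Select a Vehicle".
# Capacities are illustrative; real matching would also consider weight
# and dimensions, not just volume.

VEHICLES = [
    ("mini_van", 2.5),      # (name, cargo capacity in cubic metres)
    ("small_truck", 6.0),
    ("medium_truck", 12.0),
]

def recommend_vehicle(cargo_volume_m3, safety_margin=1.1):
    """Return the smallest vehicle whose capacity covers the cargo."""
    needed = cargo_volume_m3 * safety_margin
    for name, capacity in VEHICLES:   # list is sorted smallest first
        if capacity >= needed:
            return name
    return None  # nothing fits; escalate to manual handling

print(recommend_vehicle(4.8))
```

Keeping the list sorted by capacity means the first match is also the cheapest adequate option, which is why the recommendation can be made almost instantly once the volume is known.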

As an Internet company, we receive thousands of pieces of user feedback every day. This feedback is diverse and must be labeled, classified, and summarized, which is very inefficient to do manually. So we used large language models to build a user feedback analyzer: a small model does quick recognition and classification, and a large model then summarizes.
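
The two-stage shape of that analyzer can be sketched as: a cheap classifier routes each item into a bucket, then a summarizer condenses each bucket. In this sketch, `classify` and `summarize` are keyword-based stand-ins for the small- and large-model calls; none of this is Huolala's actual code.

```python
# Toy two-stage feedback analyzer: cheap routing, then summarization.
from collections import defaultdict

def classify(feedback):
    # Stand-in for the small model: keyword routing.
    if "invoice" in feedback.lower():
        return "billing"
    if "driver" in feedback.lower():
        return "driver_experience"
    return "other"

def summarize(category, items):
    # Stand-in for the large model: report volume per category.
    return f"{category}: {len(items)} reports"

buckets = defaultdict(list)
for fb in [
    "Invoice issuance is too slow",
    "Driver arrived late",
    "Invoice email never came",
]:
    buckets[classify(fb)].append(fb)

for category, items in sorted(buckets.items()):
    print(summarize(category, items))
```

The split matters for cost: the expensive model only sees already-grouped items, which is how a spike like "invoice issuance is slow" surfaces instead of being lost in the stream.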

We found that this approach is very effective. For example, we quickly realized that many users complained about the low efficiency of our invoice issuance. Information like this, which might have been overlooked before, can now be accurately captured.

Similarly, after colleagues leave the company or products are iterated, over time, we may forget who developed a certain feature and why it was developed in a certain way. There are a large number of knowledge blind spots.

How to solve this problem? We also used a large language model to collect all our company's PRD documents, code repositories, configurations, etc. Through data analysis, we created an AI product knowledge expert, which can help solve many historical problems, especially problems related to knowledge management and cross - departmental collaboration.
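
The retrieval step behind such a knowledge expert can be sketched as: index snippets from PRDs, repositories, and configs, retrieve the best matches for a question, and hand them to an LLM as context. A real system would use embeddings; the keyword-overlap scorer and document snippets below are purely illustrative.

```python
# Toy retrieval step for an internal knowledge expert.
# DOCS stands in for an index built from PRDs, repos, and configs.

DOCS = [
    ("PRD-2021-invoice", "invoice issuance flow owned by billing team"),
    ("repo-readme-dispatch", "dispatch service matches drivers to orders"),
]

def retrieve(query, k=1):
    """Return the ids of the top-k snippets by keyword overlap."""
    qwords = set(query.lower().split())
    scored = [
        (len(qwords & set(text.lower().split())), doc_id)
        for doc_id, text in DOCS
    ]
    scored.sort(reverse=True)
    return [doc_id for score, doc_id in scored[:k] if score > 0]

print(retrieve("who owns the invoice flow"))
```

The retrieved snippets, not the model's parametric memory, are what answer "who built this and why", which is what makes the approach work on historical, company-specific knowledge.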

In our business process, there is a large demand for sending text messages to users. Text messages are not cheap, and there is a lot of room for cost savings.

Since the content of text messages is written by humans, there is room for improvement. A large language model is naturally suitable for this task. Through intelligent optimization and analysis, we can simplify and optimize the previously inaccurate and overly long text message content, which helps us save about 12% of the text message cost per year.

There is also a potential benefit, which is risk prevention. Since a single text message may be sent to millions or even tens of millions of users, a large language model can predict potential risks in word choice and content compliance in advance, enabling us to intervene in a timely manner.

With the development of AGI technology, digital humans are becoming ubiquitous. Previously, purely text- or voice-based assistants lacked a human-like presence. Now we use realistic AI digital humans as business partners, and they have been put to good use in both internal and external scenarios.

For example, when our AI agents handled phone calls, there were often problems such as not understanding dialects and giving irrelevant answers. Moreover, when communicating with external parties, even when the answers were correct, people were often less likely to trust them once they knew the responder was an AI.

How did we solve this? We developed a three-stage pipeline connecting ASR + LLM + TTS.
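
The three-stage chain described above can be sketched as three composed functions: speech recognition produces text, the language model produces a reply, and speech synthesis turns it back into audio. Each function here is a stand-in; a real system would call an ASR service, an LLM, and a TTS engine at these points.

```python
# Minimal sketch of an ASR -> LLM -> TTS call chain.
# Each stage is a placeholder for a real service call.

def asr(audio_bytes):
    # Stand-in: pretend transcription of the caller's audio.
    return "when will my goods arrive"

def llm(text):
    # Stand-in: generate a reply from the transcript.
    return f"Replying to: {text}"

def tts(text):
    # Stand-in: synthesize audio for the reply.
    return text.encode("utf-8")

def handle_call(audio_bytes):
    return tts(llm(asr(audio_bytes)))

print(handle_call(b"...raw audio..."))
```

A chain like this is easy to build but, as the talk notes later, its end-to-end latency and accuracy are bounded by the weakest of the three stages, which motivates the move toward a single multi-modal model.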

Through our unique hot-word operations and optimization of the acoustic model provided by third-party partners, ASR achieved a semantic recognition accuracy of 94%.

More importantly, we found that giving the AI business partner a touch of regional dialect made it more realistic. So we adjusted the accent and timbre, bringing the AI's human-likeness to 92%, which is quite good.

In addition, users in online scenarios often bring emotions: when contacting AI customer service, they may be anxious or angry. In such cases, we need to soothe them promptly and guide them to the right scenario. We used large language models for question rewriting and scenario routing, together with a multi-agent approach, which significantly improved the problem-solving rate and accuracy.
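
The rewrite-then-route step can be illustrated as: first clean the user's emotional, noisy question into a plain intent, then dispatch it to a specialist agent. The keyword rules below stand in for the LLM rewriting and routing calls; agent names are hypothetical.

```python
# Toy rewrite-and-route step for an emotional customer-service query.

def rewrite(question):
    # Stand-in for LLM rewriting: strip filler/emotion tokens.
    noise = {"ugh", "seriously", "!!!"}
    return " ".join(w for w in question.split() if w.lower() not in noise)

AGENTS = {
    "refund": lambda q: "routed to refund agent",
    "eta": lambda q: "routed to ETA agent",
}

def route(question):
    q = rewrite(question)
    if "refund" in q.lower():
        return AGENTS["refund"](q)
    if "arrive" in q.lower() or "eta" in q.lower():
        return AGENTS["eta"](q)
    return "routed to general agent"

print(route("ugh seriously when will it arrive !!!"))
```

Rewriting before routing is the key ordering: the router sees a clean intent rather than raw emotion, which is what lifts the problem-solving rate.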

To conclude, let me answer the question I raised earlier. We believe that in many service - oriented business scenarios, AI basically does two things: increasing revenue and reducing costs.

Each industry is different. In our O2O industry, whether it's e-commerce or freight, the essence is still the service industry, and the core value is still the service itself, which cannot be replaced by AI. If self-driving and embodied intelligence are 100% popularized in the future, AI may be able to handle this work, but as of now, it can't. In our industry or similar industries, the efficiency-improvement ability of AI is still marginal, about 5%-10%. Some positions may be more affected, but overall, it's a way to improve efficiency, prevent risks, and reduce costs.

Next, we should remain optimistic. I think that first of all, the evolution of basic large models is advancing rapidly and has exponential development potential. Many problems that exist today may no longer be problems in three months.

In terms of specific implementation, we hope to move towards a multi-modal model solution in the future. As I mentioned earlier, our AI business partner has three stages: ASR, LLM, and TTS. However, it's still quite difficult for us to maintain its accuracy and latency.

Therefore, consolidating the three chained modules into a single end-to-end model will be our future direction. Currently, our individual digital humans perform well, but in the future we hope to connect the entire process from upstream to downstream and use multiple digital humans to improve the overall efficiency of enterprise processes.

Of course, the most important thing is the user experience. I didn't mention it much earlier because we think its current impact is relatively small. However, in the future, we hope that with the improvement of AI capabilities, an end-to-end large-model assistant can improve efficiency in areas such as intelligent vehicle selection, intelligent form filling, internal operations, and question answering.