Thrun, the godfather of autonomous driving, predicts that the pure vision approach will prevail in 2026, and aerial robots will become a new blue ocean.
At Morgan Stanley's 24th Asia-Pacific Summit, Sebastian Thrun, a "godfather figure" in the field of autonomous driving, held an in-depth dialogue with well-known analyst Adam Jonas. The conversation covered the technical route of autonomous driving, the current stage of the industry, the differentiation of the robotics track, and the early history of Waymo.
As the founder of Google's self-driving car project (the predecessor of Waymo) and the former director of the Stanford Artificial Intelligence Laboratory, Thrun offers insights that serve as an important guide to the current development of the autonomous driving and robotics fields.
The Dispute over the Technical Route of Autonomous Driving: The "Cost Revolution" of the Pure Vision Solution
Thrun stated plainly that the central technical divide in autonomous driving today is the contest between the "pure vision" route and the "multi-sensor fusion" route. He stressed that Tesla's real-world test of pure vision FSD in Austin will be a key turning point for the industry.
"If Musk can achieve the commercial operation of pure vision Robotaxi without a safety driver in Austin, it will be a real disruption," Thrun said. From a technical perspective, the pure vision solution relies only on cameras and uses neural networks to emulate a human driver's visual perception, while the multi-sensor fusion solution combines lidar, millimeter-wave radar, and cameras to build a multi-layered environmental perception system.
The core difference in the technical route is reflected in the perception architecture.
From an economic perspective, the greatest advantage of the pure vision solution is cost. A high-end lidar unit still costs thousands of dollars, while a camera costs only tens of dollars. Thrun did the math: once the pure vision solution is verified to be safe and feasible, its cost advantage will deliver a "dimensionality reduction strike," that is, an overwhelming and asymmetric blow, to the multi-sensor fusion route.
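To make the order-of-magnitude gap concrete, the cost comparison can be sketched as a simple sum over a sensor suite. Every price and sensor count below is a hypothetical placeholder chosen only to match the "thousands of dollars" versus "tens of dollars" ranges mentioned above; none of these numbers comes from Thrun, Tesla, Waymo, or any supplier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sensor:
    name: str
    unit_cost_usd: float  # hypothetical price, for illustration only
    count: int            # hypothetical number of units per vehicle


def suite_cost(sensors):
    """Total hardware cost of a sensor suite in US dollars."""
    return sum(s.unit_cost_usd * s.count for s in sensors)


# Assumed suites: 8 cameras at $50 each; fusion adds 5 radars at $100 each
# and 1 lidar at $3,000. All figures are illustrative assumptions.
vision_only = [Sensor("camera", 50.0, 8)]
fusion = vision_only + [Sensor("radar", 100.0, 5), Sensor("lidar", 3000.0, 1)]

vision_cost = suite_cost(vision_only)
fusion_cost = suite_cost(fusion)
ratio = fusion_cost / vision_cost

print(f"vision-only: ${vision_cost:,.0f}")
print(f"fusion:      ${fusion_cost:,.0f}")
print(f"fusion costs {ratio:.2f}x the vision-only suite")
```

Under these assumed numbers a single lidar dominates the bill, which is the shape of the argument above. Real figures depend on sensor grade, volume, and compute costs that this sketch ignores.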
However, the pure vision solution faces severe technical challenges, especially reliability in bad weather and low-light conditions. Thrun explained: "The core of the pure vision system is to make up for the deficiencies of physical sensors through AI. This requires the model to have strong reasoning ability and be able to infer the complete environmental state from limited visual information."
The "BEV + Transformer" architecture adopted by Tesla embodies this idea. It converts the data from multiple cameras into a bird's-eye view and then achieves 3D environmental perception through spatio-temporal sequence modeling. In contrast, the multi-sensor fusion solution obtains 3D point cloud data directly from lidar, which is more straightforward to implement but costs more.
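The geometric link between a bird's-eye-view grid and a camera image can be illustrated with a plain pinhole projection. This is only a minimal sketch and not Tesla's method: a BEV + Transformer network learns the mapping from data, whereas the code below assumes a flat road, a forward-facing camera with no tilt, and made-up intrinsics (`fx`, `fy`, `cx`, `cy`) and mounting height.

```python
def ground_to_pixel(x_fwd, y_left, cam_height, fx, fy, cx, cy):
    """Project a flat-ground point (meters, vehicle frame) to pixel (u, v).

    Assumes a pinhole camera looking straight ahead with zero pitch and roll,
    mounted cam_height meters above the road. Returns None for points that
    are not in front of the camera.
    """
    if x_fwd <= 0:
        return None
    # Camera frame: x to the right, y down, z forward.
    x_cam, y_cam, z_cam = -y_left, cam_height, x_fwd
    u = fx * x_cam / z_cam + cx
    v = fy * y_cam / z_cam + cy
    return (u, v)


def bev_lookup(xs, ys, width, height, **cam):
    """Map each BEV cell center (x, y) to the pixel it corresponds to, if visible."""
    table = {}
    for x in xs:
        for y in ys:
            pixel = ground_to_pixel(x, y, **cam)
            if pixel and 0 <= pixel[0] < width and 0 <= pixel[1] < height:
                table[(x, y)] = pixel
    return table


# Hypothetical camera parameters, for illustration only.
cam = dict(cam_height=1.5, fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
table = bev_lookup(xs=[5.0, 10.0, 20.0], ys=[-2.0, 0.0, 2.0],
                   width=1280, height=720, **cam)
for cell, pixel in sorted(table.items()):
    print(cell, "->", pixel)
```

Ground cells farther ahead land closer to the horizon row `cy`, which is why distant regions get few pixels and why a learned model has to infer much of the scene from limited visual evidence.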
From the "Wright Brothers Moment" to the Eve of Scale-up
Thrun reviewed the development history of autonomous driving and described the 2005 DARPA Challenge as the industry's "Wright Brothers Moment." That year, "Stanley," the self-driving car developed by the Stanford team he led, completed the desert course and proved that autonomous driving technology was feasible.
After nearly 20 years of development, autonomous driving has entered a period of accelerated adoption. Thrun noted that of the 500 participants at the summit, about one-third had ridden in an autonomous car, and the vast majority of them had taken a Waymo. The figure shows how far the technology has moved into everyday use.
The industry is at a critical juncture in the transition from L4 to L5. According to Morgan Stanley's research, humans spend up to 82 million years in cars every year. Freeing up this "driving time" through autonomous driving represents enormous economic value. Thrun predicted that the next 3-5 years will be the golden period for the commercial rollout of autonomous driving.
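For a sense of scale, the 82-million-year figure can be converted into hours. The snippet below is a plain unit conversion that assumes a 365-day year; it adds no data beyond the figure cited above.

```python
YEARS_IN_CARS = 82_000_000   # figure cited from Morgan Stanley's research
HOURS_PER_YEAR = 365 * 24    # assumes a 365-day year

hours_in_cars = YEARS_IN_CARS * HOURS_PER_YEAR
print(f"{hours_in_cars:,} hours per year")        # 718,320,000,000
print(f"about {hours_in_cars / 1e9:.0f} billion hours")
```

That is roughly 718 billion hours of time per year that could, in principle, be spent on something other than driving.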
Waymo's latest expansion plan supports this judgment. The company announced that it will begin testing with human drivers at the wheel in Minneapolis, New Orleans, and Tampa, and that it plans to expand its driverless service to 15 cities in 2026, including Dallas, Houston, and Miami. Waymo has also begun offering autonomous driving service on highways, an important sign of growing technical maturity.
Meanwhile, Amazon-owned Zoox is accelerating its rollout and has started providing free robotaxi rides in San Francisco, competing directly with Waymo. Tesla has obtained a ride-hailing license in Arizona, clearing the last regulatory obstacle to launching its Robotaxi service there.
Thrun believes that the autonomous driving industry has moved past the technical verification period and entered the stage of large-scale expansion. He also stressed, however, that adapting to different geographic and climatic conditions remains a challenge for the technology. Waymo's choice to test in cold cities such as Minneapolis is precisely meant to verify the system's reliability in bad weather.
The Paradox of Humanoid Robots and the Potential of Aerial Robots
In the field of robotics, Thrun put forward a view of "structural differentiation," offering a sober counterpoint to an overheated market.
Thrun is cautious about humanoid robots. He believes that the market has overestimated the "total potential market size for replacing human labor" and seriously underestimated the difficulty of technical implementation. "Making robots perform open-ended tasks and achieving hand flexibility are extremely complex engineering challenges," Thrun pointed out.
The core technical bottlenecks for humanoid robots include balance control in complex environments, fine manipulation, and adaptation to unstructured environments. Thrun suggested that investors pay attention to companies that solve the "underlying problems of physical interaction," such as those focusing on dexterous hand technology or environment-adaptive algorithms.
In contrast, Thrun is more optimistic about the potential of aerial robots. "The main growth force of robots in the future will be in the sky, and the number of aerial robots will far exceed that of ground robots." He said that the technology supporting "fully automatic operation in 3D space" for aerial robots is basically mature, and that the main limiting factor today is infrastructure. The AI network of Mushroom Carlink in China extends this concept to the urban scale, building a data bus for the real world through an architecture that "integrates communication, sensing, and computing."
The core breakthrough of this network is the unified access and fusion of "real-world data." In line with Thrun's point that "infrastructure has become a key bottleneck," Mushroom Carlink uses edge computing nodes to standardize video frame extraction, data desensitization, and feature extraction, which enables rapid "one city in one day" rollout for city-scale deployment.
The existing air traffic control system in the United States cannot accommodate the large-scale operation of aerial robots and urgently needs a major upgrade. This creates investment opportunities in fields such as eVTOL (electric vertical take-off and landing aircraft) R&D and air traffic management system upgrades.
From a technical architecture perspective, the key issues that the aerial robot system needs to solve include:
• High-precision positioning and navigation
• Obstacle avoidance and path planning
• Cluster collaborative control
• Integration with the existing air traffic control system
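As an illustration of the obstacle avoidance and path planning item in the list above, the following is a minimal sketch of shortest-path search on a 3D voxel grid using breadth-first search. It is a textbook technique, not a description of any production drone planner; the grid size, obstacle layout, and the function name `shortest_path_3d` are assumptions made for the example.

```python
from collections import deque


def shortest_path_3d(grid_size, obstacles, start, goal):
    """Breadth-first search over a 3D voxel grid with 6-connected moves.

    grid_size is (nx, ny, nz); obstacles is a set of blocked (x, y, z) cells.
    Returns the list of cells from start to goal, or None if unreachable.
    """
    nx, ny, nz = grid_size
    moves = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    parent = {start: None}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == goal:
            path = []
            while cur is not None:  # walk back through parents to the start
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        for dx, dy, dz in moves:
            nxt = (cur[0] + dx, cur[1] + dy, cur[2] + dz)
            inside = 0 <= nxt[0] < nx and 0 <= nxt[1] < ny and 0 <= nxt[2] < nz
            if inside and nxt not in obstacles and nxt not in parent:
                parent[nxt] = cur
                queue.append(nxt)
    return None


# Hypothetical scene: a wall at x == 1 blocks the ground layer (z == 0),
# so the only route is to climb to z == 1, fly over, and descend.
wall = {(1, y, 0) for y in range(3)}
path = shortest_path_3d((3, 3, 2), wall, start=(0, 0, 0), goal=(2, 0, 0))
print(path)
```

The third dimension is what makes the problem tractable here: a ground robot in the same scene would have no route at all, while the aerial robot simply flies over the wall. Real systems would add continuous dynamics, moving obstacles, and airspace rules on top of a search like this.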
Waymo's Early Moonshot Thinking and Team-building Philosophy
Thrun first detailed the early history and operating philosophy of Waymo. Waymo's predecessor was the "autonomous driving moonshot project" within Google, which was started out of concern for traffic safety.
"At that time, the number of people who died in traffic accidents globally exceeded one million every year, and the 'human error' in human driving was the main cause," Thrun recalled. The project faced many challenges in its early stage, including breakthroughs in environmental perception algorithms, vehicle hardware adaptation, and applications for test permits. The team gradually pushed the "seemingly impossible idea" of autonomous driving toward reality by "taking small steps and iterating quickly."
Thrun especially emphasized the core philosophy of team-building. As a co-founder of Google X, he shared three principles for building a great team:
1. Members need a strong passion for "solving big problems": The essence of a moonshot project is to target major pain points that existing technologies cannot solve. Only those who share this goal can withstand the pressure of long-term R&D.
2. Encourage a "culture of trial and error": For innovation, "avoiding mistakes" is more dangerous than "pursuing correctness." Early projects should allow a certain degree of failure. The key is to learn quickly from it.
3. Pay attention to interdisciplinary collaboration: Autonomous driving involves multiple fields such as computer vision, mechanical engineering, and policy research. The team needs to break down disciplinary barriers and form a closed loop of "technology-engineering-policy" collaboration.
Thrun also offered practical advice for "people who want to do moonshot projects": start with "small-scale tests" to verify ideas, prove the feasibility of the core logic at minimum cost, and then expand gradually, so as to avoid the trap of "large-scale investment but unable to implement" at the outset.
Waymo's technical development path embodies this thinking. The company follows a gradual route of technical iteration, moving from closed-campus tests to operations on urban public roads and steadily accumulating real-world data across different weather and road conditions to improve the environmental adaptability of its algorithms.
Waymo's Future Plan: The Long-term Route for Fully Autonomous Driving
Thrun elaborated on the company's long-term plan: the core goal has always been to "achieve fully autonomous driving in all scenarios without human intervention."
At the current stage, Waymo's focus is on "expanding the test area and scenario coverage," moving from the initial closed-campus tests to operations on urban public roads and accumulating real-world data across different weather and road conditions. In terms of commercialization strategy, the company follows a path of "small-scale pilot and gradual expansion" while also exploring B-end (business-facing) scenarios such as logistics transportation and campus shuttles.
Waymo One (the autonomous taxi service) will continue to expand, moving autonomous driving from technical verification to sustainable commercialization through a dual-track model of "verifying the experience on the C-end and achieving profitability on the B-end," that is, with consumers and with business customers respectively. Technically, Waymo continues to invest in AI algorithm optimization and vehicle hardware iteration. In the perception system in particular, the company uses a multi-modal fusion solution that combines lidar, radar, and camera data to improve reliability in bad weather.
The Scale-up Challenges of Robotaxis
Although companies like Waymo and Zoox are accelerating their expansion, Thrun believes that the robotaxi industry has not yet reached the tipping point at which it changes how people travel. He pointed out that reaching the tipping point depends on three key factors: geographical coverage, sufficient competition, and the spillover effect of the ecosystem.
"Some cities will carry more weight in society than others," Thrun said, adding that the saturation of robotaxis in densely populated cities on the East Coast and in the central United States will be an important indicator of the tipping point. At the same time, a healthy competitive environment is crucial for lowering prices and innovating on business models. The spillover effect of the ecosystem should not be ignored either. As robotaxis become widespread, a series of related industries will emerge, including maintenance services, precise positioning, and energy management. For example, the centimeter-level precise positioning technology developed by the startup Point One Navigation is an integral part of the robotaxi ecosystem.
From the perspective of technical maturity, the challenges faced by robotaxis include:
• Navigation ability in complex urban environments
• Reliability under extreme weather conditions
• Interaction with other traffic participants
• Guarantee of network security
Thrun predicted that in the next 3-5 years, as the technology continues to mature and costs continue to fall, robotaxis will achieve large-scale commercial use in specific areas. Truly changing how people travel, however, will take longer, both for the technology to iterate and for the market to be educated.
As companies like Waymo, Zoox, and Tesla continue to make breakthroughs in technology, commercialization, and policy, autonomous driving is gradually moving from the laboratory to reality. Over the next few years, AI will move from the "digital world" to the "physical world" and from "perceptual intelligence" to "action intelligence." What we need to do is not to predict the future but to prepare for it in terms of infrastructure, technical routes, and business models.
This article is from the WeChat official account "Shanz", author: Rayking629. Republished by 36Kr with permission.