
Tesla's latest technology sharing: the core architecture of FSD revealed.

Intelligent Vehicle Reference (智能车参考), 2025-10-22 15:58
Is Tesla also using VLA?


After a gap of several years, Tesla has finally given another technical talk. At ICCV (International Conference on Computer Vision), a top computer-vision conference, the core architecture of FSD was revealed. One detail sparked heated discussion in the industry and led to speculation that Tesla, too, is using VLA.

As for VLA versus the world model, the most hotly debated technology-route dispute of 2025, has Musk finally picked a side?

Is Tesla also using VLA?

Ashok Elluswamy, vice president of Tesla's Autopilot team, recently delivered a keynote speech at ICCV titled "Building Foundational Models for Robotics at Tesla".

This is Tesla's first public technology sharing in three years. The last one dates back to Tesla AI Day in 2022, when Elluswamy also presented and proposed the occupancy-network paradigm, which went on to reshape the industry.

However, the content of this talk has not been made public. Only one slide has leaked, but it is packed with information and has sparked extensive discussion.

From the blurry image, the slide's title can be made out: "Interpretability and Safety Assurance", both important topics in autonomous driving today.

Below the title, the core architecture of FSD is shown. FSD has now been consolidated into one large neural network that takes multimodal inputs; the figure shows camera video, navigation information, the vehicle's own motion state, and sound.

On the output side are panoptic segmentation, a 3D occupancy network, 3D Gaussian rendering, language, and further outputs Tesla chose not to disclose, represented by an ellipsis. Finally, after inference, actions are produced.
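To make the data flow on the leaked slide concrete, the interface it implies can be sketched as a single model that consumes several modalities and emits multiple supervision heads plus an action. This is purely an illustrative sketch; all names and structures here are assumptions, not Tesla's actual API, and the "model" just fabricates outputs to show the shape of the interface.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical sketch of the input/output interface implied by the leaked
# slide. Every name here is an illustrative assumption, not Tesla's API.

@dataclass
class SensorInputs:
    camera_video: List[list]        # per-camera frames (placeholder tensors)
    navigation: dict                # route / map hints
    kinematics: dict                # ego speed, steering angle, etc.
    audio: Optional[list] = None    # e.g. siren sounds

@dataclass
class ModelOutputs:
    panoptic_segmentation: list     # per-pixel semantic + instance labels
    occupancy_3d: list              # voxelized 3D occupancy grid
    gaussians_3d: list              # 3D Gaussian scene representation
    language: str                   # textual description / reasoning trace
    action: dict                    # final control command after inference

def fsd_forward(inputs: SensorInputs) -> ModelOutputs:
    """Toy stand-in for the single large network the slide depicts:
    all modalities in, several heads plus an action out."""
    # A real model would run a shared backbone here; we fabricate
    # outputs only to illustrate the data flow.
    description = "clear road ahead" if inputs.camera_video else "no input"
    return ModelOutputs(
        panoptic_segmentation=[],
        occupancy_3d=[],
        gaussians_3d=[],
        language=description,
        action={"steer": 0.0, "accelerate": 0.1},
    )

outputs = fsd_forward(
    SensorInputs(camera_video=[[0]], navigation={}, kinematics={"speed": 12.0})
)
print(outputs.language)  # the language head sits alongside the geometric heads
```

The point of the sketch is the slide's key claim: language is just one of several parallel output heads feeding the final action, not a separate module.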

The information shown and implied in this figure aligns with the current thinking of some Chinese players, in two main respects.

The Dispute between VLA and the World Model: Moving toward the Large-Model Stage

First, Tesla's current architecture ultimately outputs language information, which has fueled much speculation. Beyond this year's differences in sensor choices, a new debate has emerged over the software-algorithm route: VLA versus the world model.

The former is represented by DeepRoute.ai and Li Auto, the latter by Huawei and NIO. Some players believe the two should be combined.

VLA proponents argue that, on one hand, the paradigm can leverage the massive data already on the Internet to accumulate rich common sense and thereby understand the world. On the other hand, language capability effectively gives the model a chain of thought, letting it understand long-horizon temporal data and perform reasoning.
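The chain-of-thought claim above can be illustrated with a toy pipeline in which language sits between perception and action: observations become a textual description, the description becomes a verbal plan, and the plan becomes a control command. All rules and names below are invented for illustration; real VLA models learn these mappings end to end rather than using hand-written rules.

```python
# Toy illustration of the VLA argument: language as an intermediate
# reasoning step (chain of thought) between perception and action.
# Every rule here is hand-written purely for illustration.

def perceive(scene: dict) -> str:
    """Turn structured observations into a textual scene description."""
    parts = []
    if scene.get("pedestrian_ahead"):
        parts.append("a pedestrian is crossing ahead")
    if scene.get("light") == "red":
        parts.append("the traffic light is red")
    return "; ".join(parts) or "the road is clear"

def reason(description: str) -> str:
    """Chain-of-thought step: derive a plan expressed in language."""
    if "pedestrian" in description or "red" in description:
        return "hazard present, so the vehicle should stop"
    return "no hazard, so the vehicle may proceed"

def act(thought: str) -> dict:
    """Map the verbal plan onto a control action."""
    if "stop" in thought:
        return {"throttle": 0.0, "brake": 1.0}
    return {"throttle": 0.3, "brake": 0.0}

thought = reason(perceive({"pedestrian_ahead": True}))
print(act(thought))  # prints {'throttle': 0.0, 'brake': 1.0}
```

Because the intermediate plan is plain text, it can be logged and inspected, which is one reason VLA proponents link language to interpretability.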

Some practitioners have even said that players not following the VLA route simply lack a supply of high-compute chips and therefore cannot run large models.

World-model proponents insist their approach is closer to the essence of the problem. Jin Yuzhi, CEO of Huawei's Intelligent Automotive Solution BU, believes that "a path like VLA looks like a shortcut but cannot truly lead to autonomous driving."

Ren Shaoqing, a well-known AI scientist and vice president of intelligent driving at NIO, also said in a recent interview that the world model has "higher bandwidth" than VLA in spatio-temporal cognition, meaning it can identify and exploit more information.

However, he also acknowledges that language is very important at present, with great value in data training, logical reasoning, and human-machine interaction.

The same holds when looking at Tesla's leaked information: language has very important applications in autonomous driving. Some think Tesla is also using VLA technology; others say Tesla may simply be recognizing road signs and converting them into language. What do you think? Feel free to discuss in the comments.

Second, based on the information revealed so far, Tesla's FSD has entered the large-model stage, and its parameter count is still growing. Earlier, at its VLA launch event, DeepRoute.ai argued that the industry has entered the large-model era: models with more parameters need chips with more computing power. Hence the many new cars this year in the above-200,000-yuan segment offering 1,000 or even 2,000 TOPS of compute.

Tesla's next-generation intelligent assisted-driving chip, reportedly about to ship in cars, is said to reach 2,000 TOPS of compute, with model parameters increased tenfold. Its algorithmic capabilities are worth looking forward to.

Perhaps confident in these future capabilities, Tesla recently re-enabled FSD's aggressive mode.

FSD's Latest Update: What Has Changed

FSD has rolled out version V14.1.3 with as many as ten updates, making it safer and more "human-like".

On safety, lateral avoidance of small obstacles such as branches, tires, and boxes has been improved, and the system handles unprotected turns, lane changes, and vehicle cut-ins better.

The front-facing camera's self-cleaning function is faster and more efficient. If residue on the windshield obstructs the front-facing camera's view, the car will now raise an alert so you can contact service.

On personalization, you can now customize your driving preferences more precisely via the speed profile before setting off. And when FSD encounters special vehicles such as police cars, ambulances, and fire trucks, it now pulls over or yields.

In a traffic jam, navigation and routing are now part of FSD's neural network, which can generate detour routes in real time.

After reaching the destination, you can also choose the parking location, such as parking on the side of the road or entering the parking lot.

A few days before this point release, FSD also brought back Mad Max Mode, whose overall driving style is very aggressive. Clips shared by owners show FSD in this mode weaving through Los Angeles evening traffic, changing lanes and cutting in aggressively; it seems to beat even a helicopter for commuting efficiency.

Seeing this, Elluswamy recommended two scenarios for the mode: when you are about to miss a flight, or when you are in a hurry to pick up your child from school.

Elluswamy's social-media updates stopped the day before the ICCV speech. So what is powering Tesla's FSD V14 now, after the occupancy network and end-to-end, remains a mystery.

What is known is that since Tesla stopped sharing externally, Chinese players have kept making breakthroughs. Whether pursuing VLA or the world model, all are exploring uncharted territory.

Even if Tesla picks one of the directions, that doesn't make it the standard answer. As He Xiaopeng put it, "In fact, any powerful AI player in China stopped caring long ago about what Musk is doing."

Just as Chinese cars bid farewell to the worship of BBA, China's autonomous-driving algorithms are saying goodbye to the worship of Tesla.

This article is from the WeChat official account "Intelligent Vehicle Reference" (智能车参考, ID: AI4Auto), author: Yifan. Republished by 36Kr with authorization.