End-to-end big models catalyze the intelligent driving revolution. How will intelligent driving chips evolve?-EEWORLD

Recently, Lu Jianfeng, vice president of technology at Aixin Yuanzhi's automotive business unit, spoke at a sub-forum of the 2024 China Automotive Forum, where he set out his views on the development of high-end intelligent driving from the perspective of a chip company.

He argues that the end-to-end technology route performs far better than rule-based planning and control, and his judgment is unambiguous: an end-to-end solution is the only way forward for high-level intelligent driving. In his speech he also attributed the improvement in Tesla's FSD capabilities to the adoption of end-to-end technology, and proposed a staged view of development running from ADAS1.0 to ADAS4.0. By his account, the industry is now at the critical point of breaking through to the 4.0 stage.

End-to-end autonomous driving cannot be achieved without the support of key computing chips, which calls for architectural innovation, breakthroughs in core IP, and leaps in performance. As a chip company, Aixin Yuanzhi has launched a series of intelligent driving chips adapted to the evolution of intelligent driving algorithm architectures, and provides a rich development toolchain to support its partners and car companies.

Aixin Yuanzhi is the domestic intelligent driving chip supplier with the fastest and most efficient path to mass production, which helps automakers develop efficiently. It is also China's second-largest intelligent driving chip manufacturer by shipments, having shipped hundreds of thousands of units. Its automaker customers include new car makers, mainstream joint-venture automakers, and leading domestic brands.

Content of the talk:

Last November, Tesla released FSD V12, and its performance was striking. Compared with earlier intelligent driving systems, V12 generalizes far better across driving situations and drives in a more human-like style. The mileage between takeovers has also improved greatly: as the chart on the right shows, it is much higher for V12 than for the previous version. More importantly, urban driving makes up a large share of that mileage.

Critical Disengagement: a safety-critical takeover

% of CriticalDE: rate of no critical (safety) takeovers

The second figure shows, for trips from point A to point B, the share completed without a critical takeover and the share completed with no takeover at all. Both have risen steadily from version to version, and the share of trips with no takeover at all climbed sharply with V12.3: from 47% in V11 to 70% in V12. The user experience has improved markedly as a result.

It is therefore clear that the end-to-end technology route performs far better than rule-based planning and control.

In the middle of last year, the Shanghai Artificial Intelligence Laboratory published the first autonomous driving paper ever to win Best Paper at CVPR. It introduced UniAD, the first large autonomous driving model to integrate perception and decision-making. The approach replaces traditional rule-based algorithms entirely and shows clear advantages on multiple perception, prediction, and planning metrics on public datasets. Tesla FSD V12 follows a similar approach, which is remarkable.

As the end-to-end paradigm gained traction, academic institutions and car companies jointly released DriveVLM at the beginning of this year, which goes a step further by adding a generative large model module. This approach is also very interesting.

The system has two subsystems. The main one is an integrated perception and decision-making system similar to UniAD, which plays the role of the cerebellum: it responds quickly, runs in real time, and works as far as possible without conscious thought, much like shifting gears in a manual car. Tesla FSD V12 can be said to be designed to take over the role of the human cerebellum.

Alongside it, a large vision-language model based on generative AI plays the role of the brain. This system may respond more slowly; in other words, it is less real-time. But it is what handles very complex scenarios, such as sudden emergencies, complex traffic, or unfamiliar roads, where a human driver would need to pay closer attention and make a considered decision.

Most of the time when we drive, we are not actively thinking; we rely on habit and act “subconsciously”. Only when we meet an unusual situation does the brain start to reason and make judgments, deliberately working through long-tail problems. This is what makes the DriveVLM algorithm interesting.
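The division of labor between the fast "cerebellum" and the slow "brain" can be sketched in a few lines. This is only an illustrative toy under stated assumptions, not DriveVLM's actual design: the class `FastSlowPlanner`, the scene keys `obstacle_ahead` and `unusual`, the action strings, and the rule that the slow system runs once every `slow_period` frames and overrides the fast system when it has advice are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Plan:
    source: str  # "fast" or "slow": which subsystem produced the action
    action: str


class FastSlowPlanner:
    """Toy dual-system loop: the fast path can act on every frame,
    the slow path is consulted only once every `slow_period` frames."""

    def __init__(self, slow_period: int = 5):
        self.slow_period = slow_period  # hypothetical scheduling parameter
        self.slow_advice: Optional[str] = None
        self.frame = 0

    def fast_system(self, scene: dict) -> str:
        # Stand-in for an end-to-end perception-and-planning model.
        return "brake" if scene.get("obstacle_ahead") else "keep_lane"

    def slow_system(self, scene: dict) -> Optional[str]:
        # Stand-in for a vision-language model; it only speaks up
        # when the scene is flagged as unusual (a long-tail case).
        return "slow_down_and_yield" if scene.get("unusual") else None

    def step(self, scene: dict) -> Plan:
        # Refresh the slow system's advice at a lower rate.
        if self.frame % self.slow_period == 0:
            self.slow_advice = self.slow_system(scene)
        self.frame += 1
        # Assumed arbitration: slow advice, when present, takes priority
        # and is held until the slow system runs again.
        if self.slow_advice is not None:
            return Plan("slow", self.slow_advice)
        return Plan("fast", self.fast_system(scene))


planner = FastSlowPlanner(slow_period=2)
for scene in [{}, {"obstacle_ahead": True}, {"unusual": True}]:
    plan = planner.step(scene)
    print(plan.source, plan.action)
```

The point of the sketch is the asymmetry in rate: the fast path can answer on every frame, while the slow path's output is refreshed less often and held in between.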

Finally, in May this year, at the height of the end-to-end boom, the autonomous driving company Wayve raised $1.05 billion, the largest single financing round for a British AI company to date. Wayve is a leading company in embodied intelligence research for autonomous driving, and the two architectures it has released, GAIA and LINGO, correspond respectively to a UniAD-style end-to-end architecture and a large vision-language model architecture. This closely matches the academic community's view of where autonomous driving is heading, and the case echoes the two mainstream algorithm approaches described above.

As end-to-end technology is deployed, it will also affect today's intelligent driving solutions. Here is a summary.

First, intelligent driving solutions can be classified in many ways. One is by function, such as L2, L2+, L2.5, and L2.9; there are, of course, other schemes.

Aixin Yuanzhi is a chip manufacturer, so our perspective leans toward hardware form. We classify solutions by workload, that is, by the number of sensors and especially the number of cameras. This gives three types: all-in-one units, medium-sized domain controllers, and large domain controllers. With the arrival of end-to-end technology, however, four distinct technical stages can already be identified.

The first is the ADAS1.0 stage, where the solution is an all-in-one unit. Only the perception algorithm is AI-based, and it still includes many traditional CV algorithms. This is sufficient for how all-in-one units are currently positioned, and we chip manufacturers will continue to invest in products for this segment.

Next is the ADAS2.0 stage, where the solution is a medium-sized domain controller with 5 to 7 cameras (5V to 7V). Here the perception algorithms are AI-based, and the technical route is gradually converging from discrete perception with cross-camera tracking toward BEV/Transformer, to get better perception out of a limited set of sensors.

The biggest changes are currently in large domain controllers with 11V to 13V. Over the past two years, highway and urban NOA, that is, the ADAS3.0 stage, has added occupancy networks and general obstacle detection to its perception algorithms and incorporated lidar features, greatly improving perception accuracy. In planning and control, however, rule-based methods remain the mainstream.

By the end of last year, with the release of Tesla FSD V12, which marks the ADAS4.0 stage, the end-to-end trend in large domain controllers had become very clear, and the whole autonomous driving technology paradigm is iterating rapidly. Whether it is a modular algorithm similar to UniAD or a DriveVLM-style solution that integrates generative AI under the fast and slow systems concept, the impact on chip design will be significant.
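The four stages can be collected into a small lookup table. The sketch below only restates what the talk says; the names `AdasStage`, `STAGES`, and `hardware_for` are hypothetical, the talk gives no camera count for all-in-one units, and the 11V to 13V range for ADAS4.0 is inferred from its description as a large domain controller.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AdasStage:
    name: str
    hardware: str
    cameras: Optional[Tuple[int, int]]  # inclusive range; None if the talk gives no number
    algorithm: str


STAGES = [
    AdasStage("ADAS1.0", "all-in-one unit", None,
              "AI-based perception plus traditional CV algorithms"),
    AdasStage("ADAS2.0", "medium domain controller", (5, 7),
              "AI perception converging on BEV/Transformer"),
    AdasStage("ADAS3.0", "large domain controller", (11, 13),
              "occupancy network and lidar features; planning and control mainly rule-based"),
    # Camera range inferred: the talk places ADAS4.0 on large domain controllers.
    AdasStage("ADAS4.0", "large domain controller", (11, 13),
              "end-to-end"),
]


def hardware_for(num_cameras: int) -> Optional[str]:
    """Return the hardware form for a camera count, or None if the
    count falls outside the ranges mentioned in the talk."""
    for stage in STAGES:
        if stage.cameras and stage.cameras[0] <= num_cameras <= stage.cameras[1]:
            return stage.hardware
    return None


for stage in STAGES:
    print(stage.name, "|", stage.hardware, "|", stage.algorithm)
```

Camera counts that the talk does not mention, such as 9, deliberately map to `None` rather than to a guessed category.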

For reference:

[Autonomous driving solutions have iterated rapidly over the past few years and passed through several stages. In our view, moving from the early 1.0 stage to the 2.0 stage, from CNN-based discrete perception to BEV perception with a Transformer structure, improved efficiency. The 3.0 stage introduced the occupancy network and fused lidar features, which further improved perception accuracy; at the same time, AI planners gradually began to replace traditional rule-based planning in the planning and control module. Up to this point, however, perception still passes its results to planning and control through an explicitly defined interface, that is, the coordinates of obstacle bounding boxes and lane lines. In the end-to-end period, the 4.0 stage, model features are passed between modules instead: in UniAD, mentioned earlier, the various "former" modules pass keys (K) and values (V) to one another, which preserves as much information as possible and reduces loss. In the later end-to-end stage, the introduction of the large vision-language model, that is, a generative end-to-end large model, can better handle complex scenes and, to some extent, make autonomous driving genuinely "explainable". Is the fast and slow system the final solution for autonomous driving? (At least for now, practitioners and researchers have reached a certain degree of consensus on the end-to-end solution.)]
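To make the idea of passing K and V between modules concrete, here is a minimal sketch of single-query scaled dot-product cross-attention in plain Python. It is not UniAD's code: the function names, the two-dimensional toy vectors, and the framing of a "planning query" reading from upstream "scene tokens" are assumptions chosen for illustration. What it shows is that the downstream module receives a weighted blend of upstream features rather than a fixed list of box coordinates.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, keys, values):
    """Single-query scaled dot-product attention.

    `keys` and `values` come from an upstream module; `query` belongs
    to the downstream module that consumes them.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    dim_v = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(dim_v)]
    return context, weights


# Hypothetical upstream output: three scene tokens, each with a key and a value.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

# Hypothetical downstream planning query.
planning_query = [1.0, 0.0]

context, weights = cross_attention(planning_query, keys, values)
print("attention weights:", weights)
print("context passed downstream:", context)
```

Because the weights are soft, every upstream token contributes something to the result, which is one way to read the claim that feature passing loses less information than an explicit interface.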

The above describes the algorithm characteristics of the four ADAS stages. Aixin Yuanzhi has iterated its NPU through different versions for these stages to keep pace with technology trends.

In particular, we introduced our third-generation NPU architecture, which supports BEV/Transformer, at the ADAS2.0 stage, that is, for 5V-7V intelligent driving solutions, to further improve system performance. The fifth-generation NPU, in turn, supports end-to-end algorithm solutions similar to FSD V12.