In the era of large models, how many steps are needed for autonomous driving to come to fruition?

Publisher: 平静的33号 · Last updated: 2023-08-28 · Source: 智车科技

Large models took off first in NLP, then fueled the boom in AIGC. The next field poised for revolutionary change is autonomous driving, which has struggled toward deployment for years. Car companies riding the trend naturally cannot afford to fall behind: on July 31, Geely previewed its large-model technology; on August 8, GAC launched the "GAC AI Large Model Platform"; Chery will also release its own AI large model. Before that, Xpeng, Li Auto, and Tesla all claimed to have an "automated data closed-loop system," one application direction for large models.


In addition, platform-level companies including Baidu, Alibaba, Tencent, 360, and Huawei have all launched their own large models. By the end of the first half of the year, there were more than 80 domestic large models, and China and the United States together account for 80% of the world's large models. With large models this popular, the deployment of autonomous driving is accelerating, though it also faces new challenges.


Large models in autonomous driving


In essence, a "model" is a computer program that builds a virtual neural network. As in a biological neural network, a neuron becomes active only when its stimulus crosses a threshold, and stronger stimulation produces stronger output. Mathematically, each unit is a piecewise (multi-segment) function, and networks of such units can approximate any continuous function. These concepts were established in the 1980s and applied to autonomous driving, but without a major breakthrough. The fundamental bottleneck was parameter count, which is also a key reason ChatGPT took off: OpenAI found that once a model's parameter count crosses a certain scale, its intelligence (model accuracy) improves dramatically. The mechanism is not yet explained, but it already has a name: "emergence." How many parameters? Generally, at least on the order of 100 million. Because natural language is denser in information than images, GPT-2 has 1.5 billion parameters and GPT-3 has 175 billion, roughly on the same order as the number of neurons in the human brain.
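The threshold behavior described above is exactly what a ReLU-style activation does, and summing a few such units yields a piecewise-linear function that can approximate continuous functions. A minimal sketch in plain Python (the `units` parameterization is illustrative, not any particular framework's API):

```python
def relu(x):
    """A neuron outputs activity only once the stimulus exceeds a threshold."""
    return max(0.0, x)

def tiny_network(x, units):
    """Sum of scaled/shifted ReLU units: a multi-segment (piecewise-linear)
    function. `units` is a list of (weight, bias, output_weight) triples."""
    return sum(w_out * relu(w * x + b) for (w, b, w_out) in units)

# Two units are enough to reproduce |x|, a simple continuous function:
units = [(1.0, 0.0, 1.0), (-1.0, 0.0, 1.0)]
print(tiny_network(-3.0, units))  # 3.0
print(tiny_network(2.5, units))   # 2.5
```

Stacking many such units (and layers of them) is what lets real networks fit arbitrarily complex continuous functions; "scale" in the text above refers to the number of these weights.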


The large models used in autonomous driving are more complex and are called multimodal large models. As deep-learning models that can handle many different types of data, they can integrate inputs from different sensors and make decisions based on them. A multimodal large model usually consists of multiple branches, each processing a different data type, such as images, text, sound, or video. These branches run in parallel, and their results are eventually merged for decision making. Compared with traditional single-modal models, multimodal large models can draw richer information from multiple data sources, improving both performance and robustness. In autonomous driving, for example, a multimodal large model can simultaneously process data from cameras, lidar, and millimeter-wave radar to understand the current traffic environment more fully and make more accurate decisions.
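The branch-then-merge structure described above can be sketched in a few lines. This is a toy stand-in, assuming trivial per-modality encoders and a hand-written decision rule where a real system would use learned networks:

```python
def image_branch(pixels):
    """Stand-in for a vision encoder (a CNN in a real system)."""
    return [sum(pixels) / len(pixels)]

def lidar_branch(points):
    """Stand-in for a point-cloud encoder: nearest and farthest return."""
    return [min(points), max(points)]

def radar_branch(velocities):
    """Stand-in for a radar encoder: mean closing speed (negative = approaching)."""
    return [sum(velocities) / len(velocities)]

def multimodal_model(pixels, points, velocities):
    # Branches process each modality independently; features are merged
    # and fed to a decision head (here, a hand-written rule).
    features = (image_branch(pixels)
                + lidar_branch(points)
                + radar_branch(velocities))
    nearest, closing_speed = features[1], features[3]
    return "brake" if nearest < 5.0 and closing_speed < 0 else "cruise"

print(multimodal_model([0.2, 0.4], [3.0, 40.0], [-2.0, -1.0]))  # brake
```

The design point is that each branch can be developed and upgraded independently as long as the merged feature interface stays stable.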


How do large models reshape autonomous driving?


Specifically, how are large models deployed on autonomous vehicles, and in which modules can they be deployed?


The first is the perception module. As is well known, autonomous-driving perception must fuse and enhance the inputs of multiple sensors. In this process, when sensor detections conflict, deciding which sensor's result to trust becomes a hard problem. One advantage of large models is that they can handle the relationships between different types of data. In autonomous driving, for example, cameras provide image information about roads and obstacles, lidar provides distance and depth, and millimeter-wave radar provides speed and direction. A large model can fuse these different data types into a more complete and accurate understanding of the driving environment.
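One classical way to resolve the conflicting-sensors problem mentioned above is to weight each estimate by its confidence. A minimal inverse-variance fusion sketch (a simplified stand-in for what a learned fusion network or Kalman filter would do; the variance figures are assumptions for illustration):

```python
def fuse(estimates):
    """Inverse-variance weighted fusion of conflicting range estimates.
    `estimates` is a list of (value, variance) pairs, one per sensor;
    lower variance means a more trusted sensor."""
    weights = [1.0 / var for (_, var) in estimates]
    total = sum(weights)
    return sum(w * v for (v, _), w in zip(estimates, weights)) / total

# Camera says the obstacle is 12 m away (noisy), lidar says 10 m (precise).
# The fused estimate lands between them, much closer to the lidar.
print(fuse([(12.0, 4.0), (10.0, 0.25)]))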


Second is completing automatic labeling and pre-labeling in object-detection tasks. In the past this relied on supervised learning with human-labeled data; for AI to train itself, the data closed loop must be completed first. This is why several new-force car makers say they have "automated data annotation systems," which are in fact a function of large models. The complete data closed loop includes data collection, data reflow, data processing, data annotation, model training, and testing and verification. Among these, data annotation is the prerequisite for AI self-training and also the main cost node of AI training.
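The stages of the data closed loop listed above can be sketched as a single gated iteration. The `label_fn`, `train_fn`, and `eval_fn` callables are placeholders for real annotation, training, and validation components, not any vendor's actual API:

```python
def closed_loop(frames, label_fn, train_fn, eval_fn, threshold=0.9):
    """One iteration of the data closed loop: annotate collected frames,
    train a model, verify it, and deploy only if it passes the gate."""
    labels = [label_fn(f) for f in frames]        # data annotation
    model = train_fn(frames, labels)              # model training
    score = eval_fn(model)                        # testing and verification
    return model if score >= threshold else None  # gate deployment

# Toy usage with stand-in components:
model = closed_loop(
    frames=["frame_001", "frame_002"],
    label_fn=lambda f: "car",                  # stand-in auto-labeler
    train_fn=lambda xs, ys: {"trained_on": len(xs)},
    eval_fn=lambda m: 0.95,
)
print(model)  # {'trained_on': 2}
```

In a real system the loop runs continuously: deployed vehicles stream new frames back in (data reflow), and each pass retrains and re-verifies the model.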


Annotation means labeling the key information in video or images so that the system can understand it and plan accordingly in real operation. The scenes collected by mass-produced cars are mostly repetitive, so that data is of limited value, while dedicated collection vehicles are expensive (6,000 to 10,000 yuan per day). The key is collecting as many "long-tail scenes" as possible: situations that are rarely encountered, yet that almost every driver meets after enough driving (roughly 5% of scenes). Before large models came online, annotation was manual; labeling 1,000 frames of video could cost around 10,000 yuan. The most valuable current use of large models is automated data labeling, which may save hundreds of millions of yuan, depending on how many human annotators are replaced.
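The cost figures above imply roughly 10 yuan per manually labeled frame. A back-of-the-envelope comparison, assuming (hypothetically) that after auto pre-labeling humans only review a fraction of frames at the same per-frame rate:

```python
MANUAL_COST_PER_FRAME = 10_000 / 1_000  # yuan per frame, from the article

def annotation_cost(frames, auto=True, review_rate=0.1):
    """Rough labeling-cost comparison. `review_rate` is an assumed
    fraction of auto-labeled frames that humans still spot-check."""
    if not auto:
        return frames * MANUAL_COST_PER_FRAME
    return frames * review_rate * MANUAL_COST_PER_FRAME

# For one million frames:
print(annotation_cost(1_000_000, auto=False))  # 10000000.0 yuan, all manual
print(annotation_cost(1_000_000))              # 1000000.0 yuan with pre-labeling
```

Even this crude model shows why, at fleet-scale data volumes, automated pre-labeling is where large models pay for themselves first.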


In the era of large models, car companies face new challenges


In the era of large models, the enormous parameter counts mean that the volume of collected data is also growing rapidly. How to use the driving data that car companies depend on for survival, how to deploy and train through distributed systems, and how finally to apply the results to vehicles has become a major problem.


Generally speaking, the work of the data closed loop is now divided among outsourced suppliers, large-model platform companies (which can also be regarded as suppliers), and car companies. Very few car companies can deploy their own foundation model end to end, handle the application layer themselves, design pre-annotation, implement the data closed loop, and then drive algorithm updates. This is the evolution path toward L4. Its technical complexity demands fully integrated cooperation between car companies and suppliers, rather than the traditional pattern of supplier "delivery" followed by OEM application.


Car companies that insist on comprehensive control of the value chain and full-stack self-development may delay their own iteration. Consider a division of labor like this: the base-layer large model is designed by a platform-level company; the car company masters the labeling rules and hands subsequent manual review to another third party; after receiving the labeled data back, it trains on its own. Through such task decomposition, a car company keeps itself at the core of the autonomous-driving value chain, avoids being held hostage on key technologies, and is not forced to accept a single supplier's integrated software-and-hardware solution.


All in all, the rise of large models has set off a carnival among the new forces that have mastered intelligent driving. As the author, I have always believed that to reach the end of the road at L4, a company must work on both intelligent driving and the vehicle itself. Tesla has already pointed the way for latecomers, and the arrival of the large-model era will further widen the gap between the new forces and traditional car makers.


Summary


In the future, the ultimate form of the car will be a mobile smart terminal. The popularity of large models is also driving the deployment of intelligent driving in cars, making L4 and even higher levels of autonomous driving possible.


Even though many believe that corner cases will remain an obstacle to autonomous driving, over time more and more extreme-scenario data will be collected and the datasets will grow more complete. One day, large models may learn all driving situations; when the era of autonomous driving truly arrives, the intelligent-driving companies at the forefront will enjoy a first-mover advantage.


