"People are used to dividing everything into black and white, but unfortunately, reality is all gray." This sentence written by Liu Cixin is also a true portrayal of the autonomous driving industry. Two schools of thought, focusing on perception and focusing on maps, talk about Taoism in Huashan Mountains, and undercurrents hit the water. But right now, there is no optimal solution to completely rid the car of human intervention.
Because whichever shortcut one takes, building a smarter car is the only way through. Especially as autonomous driving scenarios extend from highways to urban roads, improving the vehicle's perception and cognitive capabilities becomes increasingly critical.
On the one hand, however information-rich a high-precision map may be, the roads it describes are constantly changing. In Beijing, road topology changed at an average of 5.06 points per 100 kilometers over half a year, and Guangzhou sees an average of two road-diversion construction projects per day; only continuous data collection and uploading can keep a map fresh. On the other hand, road participants are disorderly and random: beyond vehicles, unpredictable factors such as pedestrians and non-motorized vehicles are a major test for the advancement of autonomous driving.
Wu Xinzhou, vice president of autonomous driving at Xpeng Motors, once put it bluntly: "Compared with highway NGP, if I had to express it with a number, urban NGP may be more than a hundred times harder." Yet to reach large-scale mass production of autonomous driving, urban NGP is a hurdle that must be cleared.
The "9981 Difficulty" of Training Data
To date, many Chinese cities, including Beijing, Chongqing, Wuhan, Shenzhen, Guangzhou, and Changsha, have allowed commercial trial operation of self-driving vehicles in specific areas and time windows. Not long ago, Beijing issued road test permits for the stage of fully driverless operation with remote supervision.
Unmanned testing of autonomous driving has thus moved from "safety operator in the passenger seat" and "nobody in front, operator in the back" to a third stage: "operator remotely outside the vehicle". Throughout, one constant theme is using a steady stream of data to polish the autonomous driving perception model. The model sets the upper limit of capability; data is the driving force behind it. So the first question is: how can more valuable training data be obtained at lower cost and higher efficiency?
Image source: Tianfeng Securities
It may sound a bit incredible, but take data labeling as an example. In the past, the common industry practice was to annotate single 2D image frames, roughly one frame per second of video. Yet real video contains more than ten frames per second; in other words, many frames in between were never labeled, and that portion became a "wasted" resource.
Moreover, as autonomous driving annotation moves to 4D space (3D space plus the time dimension), the minimum annotation unit becomes a Clip, essentially a short video containing camera and sensor data, which makes manual annotation even harder.
A research report from Tianfeng Securities shows that autonomous driving at L3 and above requires large amounts of 3D point cloud data. It demands not only real-time processing and analysis of the data returned by sensors, but also handling of curved lane lines and of wear and damage that distort shape and reflectivity, all of which pose great challenges to recognition accuracy.
Therefore, if these discrete frames are expanded into Clip form, the cost of manual annotation and rework will inevitably drive up the cost of training autonomous driving models. This is the key reason Tesla went from outsourcing data annotation, to building its own manual annotation team, to pushing for automated annotation. Domestic car companies such as Xpeng have likewise built fully automatic labeling systems, raising efficiency by a claimed factor of nearly 45,000: a labeling task that once took 2,000 people a year can now be completed in about 16.7 days.
Beyond car companies, autonomous driving firms are also actively experimenting, among them Haomo Zhixing, which launched a video self-supervised large model on top of its data intelligence system MANA. Put simply: apply image masks to hide certain regions of the video, give the model the previous frame, let it guess the next frame, and it learns to extract features on its own.
Image source: Haomo Zhixing
Fully annotated Clips are then given to the model for fine-tuning. Iterating this way, the model's accuracy and precision improve through deep learning. With this video self-supervised large model, Haomo Zhixing reduced Clip annotation cost by 98%. Moreover, since the large model running on the server generalizes better, once training is complete and the model is deployed on the vehicle-side autonomous driving platform, its predictive ability is stronger.
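The mask-and-predict idea above can be sketched in a few lines. This is only a toy illustration under stated assumptions, not Haomo's MANA pipeline: the `mask_frame` helper, the patch size, and the identity "model" standing in for a trained network are all hypothetical, chosen to show where the label-free loss signal comes from.

```python
import numpy as np

def mask_frame(frame, patch=4, mask_ratio=0.5, rng=None):
    """Zero out randomly chosen square patches of a frame (the 'image mask' step)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    h, w = frame.shape
    masked = frame.copy()
    keep = np.ones((h // patch, w // patch), dtype=bool)
    drop = rng.choice(keep.size, size=int(keep.size * mask_ratio), replace=False)
    keep.flat[drop] = False
    for i in range(h // patch):
        for j in range(w // patch):
            if not keep[i, j]:
                masked[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = 0.0
    return masked, keep

def self_supervised_loss(model, prev_frame, next_frame):
    """Predict the next frame from a masked previous frame.
    The MSE between prediction and the real next frame is the training
    signal -- no human annotation is involved anywhere."""
    masked, _ = mask_frame(prev_frame)
    pred = model(masked)
    return float(np.mean((pred - next_frame) ** 2))

# Toy 'model': identity predictor, assuming a nearly static scene.
rng = np.random.default_rng(42)
frame_t = rng.random((16, 16))
frame_t1 = frame_t + 0.01 * rng.standard_normal((16, 16))
loss = self_supervised_loss(lambda x: x, frame_t, frame_t1)
```

A real system would replace the identity lambda with a video transformer and minimize this loss over millions of unlabeled clips; the point is that the video itself supplies the supervision.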
However, this alone is not enough. At this stage, autonomous driving's appetite for data is far from satisfied, and a rich data distribution is the prerequisite for training and optimizing perception models.
When building an autonomous driving system, whether data is pre-collected by dedicated collection vehicles or streamed back from mass-produced vehicles, development cycles remain long and costs high. Simulation is therefore regarded as an accelerator of autonomous driving development and is widely used in the industry: systems typically undergo extensive simulation testing before being installed in mass-production vehicles.
However, Ai Rui, vice president of technology at Haomo Zhixing, pointed out that, given the differing characteristics of each sensor, current simulation technology still has much room for improvement. For example, lidar generally has a lower noise floor than millimeter-wave radar, and the two respond very differently to conditions such as rain, snow, and fog, which makes modeling them in the same scene harder.
"It's like watching a movie. No matter how well-done CG animation is, it can still be distinguished from real scenes." Compared with over-reliance on simulation technology, Millimeter Zhixing is interested in using low-cost general scene generation to achieve high-cost Advantages of corner cases.
This is also the fundamental reason Haomo Zhixing introduced NeRF (Neural Radiance Fields) into its 3D reconstruction large model. NeRF is a 3D reconstruction technique that emerged in 2020; able to synthesize a 360-degree surround view from just a few pictures, it quickly became popular in e-commerce.
In autonomous driving, NeRF not only helps reconstruct scene data but also allows the viewpoint to be adjusted. In this way, driving under extreme road conditions can be simulated and long-tail scenarios covered more comprehensively. Lighting changes, nighttime effects, and so on can also be simulated to generate the data needed.
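At the heart of NeRF's ability to render a scene from any viewpoint is volume rendering: along each camera ray, sampled densities and colors are alpha-composited into one pixel. The sketch below shows only that compositing step, with hand-picked sample values rather than a trained network, and makes no claim about Haomo's actual implementation.

```python
import numpy as np

def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along one camera ray (NeRF volume rendering).
    densities: sigma at each sample point; colors: RGB per sample;
    deltas: spacing between consecutive samples."""
    alphas = 1.0 - np.exp(-densities * deltas)                       # segment opacity
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))   # transmittance
    weights = trans * alphas                                         # per-sample weight
    rgb = (weights[:, None] * colors).sum(axis=0)                    # rendered pixel
    return rgb, weights

# One ray with 4 samples: empty space, then an opaque red surface.
sigma = np.array([0.0, 0.0, 50.0, 50.0])
rgb_samples = np.array([[0.0, 0.0, 0.0],
                        [0.0, 0.0, 0.0],
                        [1.0, 0.0, 0.0],
                        [1.0, 0.0, 0.0]])
delta = np.full(4, 0.1)
pixel, weights = composite_ray(sigma, rgb_samples, delta)
```

Because the weights depend only on the ray being marched, moving the camera simply marches different rays through the same learned field, which is exactly what makes viewpoint adjustment and synthetic night or lighting variants possible.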
After adding the data generated with NeRF, Haomo Zhixing cut its perception error rate by at least 30% from the previous baseline. More data is better, but the key is not only vertical "quantity" but also horizontal "richness". Facing the mountain of data, accumulation is the only way forward.
Tesla has a fleet of a million vehicles; Xpeng, a hundred thousand; Haomo Zhixing relies on the brand scale of Great Wall Motors. By the end of 2022 its cumulative mileage exceeded 25 million kilometers, nearly 20 models carry the HPilot system, and monthly installations are growing at over 200%. Haomo expects HPilot to land in 100 Chinese cities by the first half of 2024.
Autonomous driving "entering the city" is more difficult to recognize than to perceive
Building perception capability from big data is only the first step toward autonomous driving. Beyond that, Tsinghua University professor Deng Zhidong pointed out in an interview with domestic media that a core technical difficulty of autonomous driving is how the car understands complex dynamic driving scenarios (DDS) so as to guarantee safety.
In his view, human driving is grounded in cognitive understanding, relying on interpretable visual perception and the brain to reach decisions. By contrast, it is hard for autonomous vehicles to acquire human-level perception, prediction, and cognitive decision-making in complex dynamic environments.
Earlier, Haomo Zhixing launched a surround-view perception algorithm (BEV) based on the Transformer architecture and gradually applied it on real roads. CEO Gu Weihao noted, however, that once the BEV solution was on vehicles, detection of lane lines and common obstacles was relatively good, and detection range and measurement accuracy under various complex conditions improved markedly. But hard problems remain, above all the stable detection of the many irregularly shaped obstacles on urban roads with a vision-based solution.
There are generally two approaches. One is to expand the semantic whitelist: taking tire recognition as an example, this means collecting large amounts of tire data and enlarging the labeled sample set, which is time-consuming and labor-intensive. The more general alternative may achieve twice the result with half the effort: there is no need to understand what the obstacle is; based on information such as its height, judge whether it obstructs traffic, and if so avoid it or drive around it.
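The class-agnostic idea, "don't classify it, just decide whether it blocks the path", can be sketched as a height filter over a point cloud. Everything here is a simplified assumption: the ground is taken as a flat plane, the 15 cm drivability threshold, the 0.5 m grid, and the `blocking_obstacles` name are all hypothetical, not Haomo's actual logic.

```python
import numpy as np

def blocking_obstacles(points, ground_z=0.0, min_height=0.15, cell=0.5):
    """Flag grid cells containing points higher than a drivability threshold
    above the (assumed flat) ground -- no object classification needed.
    points: (N, 3) array of x, y, z in the vehicle frame.
    Returns the set of (ix, iy) cells the planner should avoid or go around."""
    heights = points[:, 2] - ground_z
    tall = points[heights > min_height]          # keep only points that stick up
    return {(int(x // cell), int(y // cell)) for x, y, _ in tall}

# A low manhole cover (drivable) and a fallen tire (must be avoided).
cloud = np.array([
    [2.0, 0.0, 0.02],   # manhole cover: 2 cm, below threshold
    [5.0, 1.0, 0.25],   # tire sidewall: 25 cm, blocks the lane
    [5.1, 1.1, 0.30],
])
blocked = blocking_obstacles(cloud)
```

The tire is flagged without the system ever knowing it is a tire, which is exactly why this route scales better than endlessly growing a labeled whitelist.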
For this purpose, Haomo has launched a multi-modal mutual-supervision large model and a dynamic environment large model. The former uses the differing characteristics of cameras, lidar, millimeter-wave radar, and other sensors to supervise one another in identifying general obstacles and general structures. The latter is somewhat similar to the video self-supervised large model, and its purpose is to strengthen the system's perception capability.