Introduction:
On July 26, the "SenseTime Jueying Autonomous Driving Technology Open Course" jointly planned and launched by SenseTime Jueying and Zhidongxi Open Course was successfully concluded. Dr. Wang Zhe, Director of SenseTime Jueying, gave a live lecture on the theme of "Construction of General Target Perception System for Autonomous Driving".
Dr. Wang Zhe first analyzed the three major challenges in building a general target perception system for autonomous driving. He then explained the construction of SenseTime Jueying's general perception capabilities from the three dimensions of data, algorithms, and computing power, and shared the practical application of SenseTime Jueying's general perception capabilities using a tow truck as an example.
This open course was divided into two parts: the main lecture and a Q&A session. This article is a review of Dr. Wang Zhe's lecture.
Thank you all for joining today's live broadcast. It is my honor to give this talk on behalf of our team. The project is called General Object Perception, or GOP.
The topic of today's lecture is "Construction of General Target Perception System for Autonomous Driving", which is divided into four parts:
1. Challenges in building a general object perception system
2. Three dimensions of problem solving
3. Practice and results
4. What we are working on
1 Challenges in Building a General Object Perception System
Let’s start with a question: why do we need general object perception?
As the figure above shows, autonomous driving technology has been steadily deployed in recent years, and some mass-produced models have begun to be delivered. As the assisted driving functions of L2+ and L3 intelligent vehicles evolve, scenario complexity keeps growing: from single scenarios to multiple scenarios, and from highways to urban areas.
As scenario coverage expands, especially once autonomous vehicles enter urban areas, the challenges they face also grow. In urban areas, for example, a vehicle may encounter many kinds of vehicles carrying out tasks. Previously it was enough to know that these were cars, but in the city, knowing each vehicle's type and the task it is performing makes it possible to adjust the driving strategy more appropriately.
Construction scenes are also common in urban areas. They can contain many forms of traffic warning objects, and perception needs to cover these long-tail objects well. Urban traffic lights also take many forms: some are occluded or truncated, some complex scenes have multiple traffic lights that must be matched against the map, and some lights are temporary. These examples pose a great challenge to perception algorithms, which must recognize very rich semantic elements and cover targets of many different forms.
We divide the challenges of general object perception into three aspects:
First, the targets form an open set. When an autonomous vehicle drives on the road, the objects it has to interact with cannot be enumerated or preset in advance; there is no way to know what kinds of objects it will encounter on a given day.
This raises two questions. First, are there online algorithms on the vehicle side that can cope with such objects, or at least recognize that they are there and work with a multi-sensor solution for detection and obstacle avoidance? Second, since an autonomous vehicle must be continuously and dynamically upgraded and iterated, can a better algorithm be iterated within a very short time after an open-set object is encountered, so that the detection and tracking problems it causes are solved?
Second, the semantic level of targets gradually becomes more refined, from only traffic participants at the beginning to increasingly fine-grained semantic labels.
Third, the semantic elements of every category follow a long-tail distribution: most targets in a category look similar, but there are always some that look somewhat different. These rare targets form the long tail of the distribution, and they pose greater challenges to perception algorithms.
Let’s look at some examples of these three challenges.
First, the target category is an open set. As shown in the figure above, these are some real scenes that our autonomous driving team encountered during road testing, including rocks falling on the road, pennant strings, tarpaulins on the ground, flying plastic bags, dogs running on the road, and even birds flying in the sky.
Generally speaking, these objects are not among the targets that the autonomous driving industry or academia define and detect. There are two main difficulties. First, these categories cannot be exhaustively listed and are very complex; if you try, you will find it hard to work out a top-down labeling system to manage and maintain them. Second, these categories appear relatively rarely, so a very large volume of data is needed to mine them.
The second point is that the semantic level needs to be continuously refined. When we first worked on assisted driving functions such as lane keeping assist (LKA) and forward collision warning (FCW), it was often enough to identify traffic participants in a simple way, for example knowing that there were pedestrians, vehicles, and non-motor vehicles ahead, and then perform obstacle avoidance or simple vehicle control. With a finer semantic level, for example further dividing motor vehicles into large freight vehicles, privileged vehicles, and small vehicles, a different driving strategy and a different avoidance strategy can be adopted for each type.
Privileged vehicles can be subdivided further. China's Road Traffic Safety Law stipulates four types of privileged vehicles: police cars, ambulances, fire trucks, and road rescue vehicles. These four types may disregard traffic signals and certain traffic rules, and even on speed-limited roads they are not bound by the limit. They have a relatively strong right of way.
In addition, we have added some categories of our own, such as school buses, which are usually used to pick up and drop off students. What characterizes a school bus? It has a warning sign near the exit door that stays folded while the bus is moving and unfolds when students get off, so the shape of the vehicle itself changes. If the perception algorithm can recognize a school bus, the driving behavior can be adjusted more appropriately.
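The refinement described above can be pictured as a label tree in which each fine-grained label has a coarser parent. The following Python sketch is only an illustration and not SenseTime Jueying's actual labeling system: the label names, the `PARENT` map, the placement of `school_bus` directly under `motor_vehicle`, and the helpers `label_path` and `coarsen` are all hypothetical.

```python
from typing import Dict, List, Optional, Set

# Hypothetical child -> parent map built from the categories named in the talk.
# Where "school_bus" sits in the tree is an assumption made for illustration.
PARENT: Dict[str, Optional[str]] = {
    "traffic_participant": None,
    "pedestrian": "traffic_participant",
    "non_motor_vehicle": "traffic_participant",
    "motor_vehicle": "traffic_participant",
    "large_freight_vehicle": "motor_vehicle",
    "small_vehicle": "motor_vehicle",
    "privileged_vehicle": "motor_vehicle",
    "school_bus": "motor_vehicle",
    "police_car": "privileged_vehicle",
    "ambulance": "privileged_vehicle",
    "fire_truck": "privileged_vehicle",
    "road_rescue_vehicle": "privileged_vehicle",
}


def label_path(label: str) -> List[str]:
    """Return the labels from the root of the tree down to `label`."""
    path: List[str] = []
    current: Optional[str] = label
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return list(reversed(path))


def coarsen(label: str, supported: Set[str]) -> Optional[str]:
    """Return the finest label on the path to `label` that a consumer supports."""
    # Walk from the fine-grained label upward toward the root.
    for candidate in reversed(label_path(label)):
        if candidate in supported:
            return candidate
    return None


if __name__ == "__main__":
    print(label_path("ambulance"))
    print(coarsen("ambulance", {"pedestrian", "motor_vehicle"}))
```

With a structure like this, a downstream module that only understands coarse labels could fall back from `ambulance` to `motor_vehicle`, while a module that knows about privileged vehicles could use the full path.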
Third, target shapes follow a long-tail distribution. The figure above shows vehicles of the same broad type in very different shapes, including tractors and other agricultural machinery, trailers, transport cars, vans, garbage trucks, and police cars, as well as vehicles carrying protruding objects, rescue vehicles, and fire trucks. They are all vehicles, but their shapes differ greatly.
So how do we address these three challenges?
2 Three dimensions of problem solving
We address these problems along three dimensions: data, algorithms, and computing power.
On the data dimension, we first tackle the open-set problem. This is a problem no autonomous driving company can avoid: how do we understand and organize all the targets encountered in driving scenarios? Based on current domain knowledge, we divide the targets that an autonomous vehicle needs to interact with into four major categories.
The four major categories are traffic participants, traffic facilities, animals, and other obstacles on the road.
Why do we divide them this way?
Traffic participants are among the highest-priority objects in autonomous driving. We need to be especially careful to avoid colliding with them because people are involved: they are usually intelligent agents, such as bicycles or vehicles operated by people, and a collision could cause personal injury. Traffic participants behave with a certain rationality and have their own value functions. We usually detect and track them, and downstream modules also need to make predictions for each one, including behavior prediction and trajectory prediction. These objects therefore have a very high priority.
The second category is the traffic facilities that appear on the road. We define traffic facilities as facilities that have certain functional attributes in the traffic scene, such as lane lines, traffic lights, traffic platforms, and traffic police objects. Traffic facilities generally define the structure of the road and the drivable area, and they indicate which traffic rules apply and which traffic signals must be followed within that drivable area.
Traffic warning objects are a typical example. They are obstacles in their own right: if a water-filled barrier or a traffic cone is placed in front of you, you need to go around it. At the same time, they serve as warnings, indicating that there is probably a construction scene or accident scene nearby that needs attention. Because they carry a function of their own, they are grouped under traffic facilities.
The third major category is animals. Why are animals a separate category? We regard animals as inherently movable objects. In autonomous driving, whether an object moves or not is very important for downstream decision-making and planning: for a moving object, the speed and motion state may need to be estimated. Animals can move, but they do not understand traffic rules and may cross the road at random, so they differ from traffic participants and are kept as a separate category.
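One way to picture this division is as a small lookup table from each top-level category to the kind of downstream handling it calls for. The sketch below is hypothetical: the `TopCategory` and `Handling` names and the three flags are illustrative assumptions, the values for traffic participants, traffic facilities, and animals are only a loose reading of the reasoning above, and the entry for other obstacles is a placeholder.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class TopCategory(Enum):
    TRAFFIC_PARTICIPANT = "traffic participant"
    TRAFFIC_FACILITY = "traffic facility"
    ANIMAL = "animal"
    OTHER_OBSTACLE = "other obstacle"


@dataclass(frozen=True)
class Handling:
    estimate_motion: bool   # speed and motion state may need to be estimated
    predict_behavior: bool  # behavior and trajectory prediction downstream
    defines_rules: bool     # carries road-structure or traffic-rule meaning


# Assumed flag values, for illustration only.
HANDLING: Dict[TopCategory, Handling] = {
    # Detected, tracked, and predicted; highest priority.
    TopCategory.TRAFFIC_PARTICIPANT: Handling(True, True, False),
    # Defines road structure, drivable area, and applicable rules.
    TopCategory.TRAFFIC_FACILITY: Handling(False, False, True),
    # Can move, but does not follow traffic rules.
    TopCategory.ANIMAL: Handling(True, False, False),
    # Placeholder assumption for the remaining category.
    TopCategory.OTHER_OBSTACLE: Handling(False, False, False),
}


def needs_motion_estimate(category: TopCategory) -> bool:
    """Tell a downstream module whether to estimate speed for this category."""
    return HANDLING[category].estimate_motion


if __name__ == "__main__":
    for category in TopCategory:
        print(category.value, HANDLING[category])
```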