Many autonomous driving companies have released some of their training and validation datasets, including Waymo, Baidu, Mercedes-Benz (Cityscapes), Nvidia (PilotNet), Honda (H3D), Aptiv (nuScenes), Lyft, and Uber. Several well-known universities have done the same, including MIT, Cambridge, Oxford, Berkeley, the California Institute of Technology (Caltech), CMU, the University of Sydney, the University of Michigan, Ruhr University in Germany (traffic lights), York University in Canada (JAAD), and Stanford. Among all of these, Argo AI's dataset Argoverse is particularly worth examining.
On June 2, 2020, Argo AI CEO Bryan Salesky and Reinhard Stolle, vice president of the company's Munich division (the former AID, Autonomous Intelligent Driving), jointly announced on the official blog that Argo AI had become an international company. The clearest sign of this milestone is that Volkswagen Group officially completed its $2.6 billion investment in Argo AI. Volkswagen holds the same stake as Ford, with the remainder belonging to Argo AI employees. Argo AI's board of directors also grew from 5 to 7 seats, with Volkswagen and Ford each holding 2 and the other 3 belonging to Argo AI itself. Unlike most autonomous driving companies, Argo AI is not headquartered in California: its home is Pittsburgh, a center of American robotics and autonomous driving research.
In the perception stage of autonomous driving, it is necessary not only to detect moving targets but also to predict their trajectories, just as human drivers do. This capability, known as MODT (Moving Object Detection and Tracking), is what true autonomous driving requires and what improves safety. The datasets mentioned above are all aimed at detection, whereas Argoverse is mainly aimed at 3D trajectory tracking and motion forecasting; that is its unique contribution.
Furthermore, Argoverse pairs high-definition maps with 3D trajectory tracking and prediction, using the deterministic information in the map to improve the determinism of the overall system. This is exactly what major autonomous driving manufacturers care about; traditional carmakers in particular emphasize determinism and safety. Only two public datasets include high-definition maps: Argoverse and nuScenes.
Comparison of Argoverse with other datasets
The picture above shows Argo AI's data collection vehicle. The lidar unit consists of two VLP-32C sensors stacked together, giving a scanning density three times that of nuScenes. Seven 2-megapixel cameras are distributed in a ring, running at a frame rate of 30 Hz; a 5-megapixel stereo camera pair runs at 5 Hz with a baseline of 29.86 cm; and the vehicle carries 6-DOF high-precision localization. This data collection vehicle is also Argo AI's autonomous driving prototype, and the main collection locations are Pittsburgh and Miami.
The top row shows images from four of the 2 MP ring cameras, the middle right shows a stereo camera image, the bottom row shows the other three 2 MP ring camera images, and on the right is the vector map with ground height. All sequences are aligned to a map containing lane centerlines (magenta), drivable areas (orange), and ground height, and are annotated with 3D cuboid tracks (green).
Three reference coordinate systems are shown: (1) the vehicle frame, Xv forward, Yv left, Zv up, (2) the camera frame, Xc across the image plane, Yc down the image plane, and Zc along the optical axis, and (3) the LiDAR frame, XL forward, YL left, ZL up. For each frame, positive rotations RX, RY, RZ are defined as rotations about the corresponding axis following the right-hand rule.
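As a minimal sketch of how these frame conventions compose, the snippet below maps LiDAR-frame points into the vehicle frame; the rotation follows the right-hand rule defined above, and the extrinsic values are illustrative placeholders, not Argoverse calibration data:

```python
import numpy as np

def rot_z(theta: float) -> np.ndarray:
    """Right-hand-rule rotation about the Z axis by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

# Placeholder LiDAR -> vehicle extrinsics: a small yaw offset plus a
# mounting position forward of and above the vehicle origin.
R_vehicle_from_lidar = rot_z(np.deg2rad(1.5))
t_vehicle_from_lidar = np.array([1.2, 0.0, 1.8])  # meters

def lidar_to_vehicle(points_lidar: np.ndarray) -> np.ndarray:
    """Map an (N, 3) array of LiDAR-frame points into the vehicle frame."""
    return points_lidar @ R_vehicle_from_lidar.T + t_vehicle_from_lidar
```

Since both the LiDAR and vehicle frames here are X-forward, Y-left, Z-up, the extrinsic rotation is small; mapping into the camera frame would additionally require the axis permutation into X-across, Y-down, Z-forward.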
3D scene understanding would be much easier if the map directly told us which 3D points belong to the road, which belong to static buildings, which lane a tracked object is in, and how far it is to the next intersection; predicting motion trajectories would likewise become easier. However, since publicly available datasets have not contained rich map attributes, how to represent and exploit these features remains an open research problem. Argoverse is the first large-scale autonomous driving dataset with such detailed maps. Argo AI studies the potential uses of these new map features on two tasks, 3D tracking and motion forecasting, and provides a large amount of annotated real-world data as a new benchmark for both problems.
Argoverse scene visualization uses a lidar bird's-eye view: vehicles and other targets are marked with 3D boxes, the drivable area computed from the stereo cameras is shown in cyan, and yellow lines mark the boundaries. This also suggests that Volkswagen's and Ford's future autonomous driving, like Mercedes-Benz's, will put stereo vision at the core: stereo cameras compute the drivable area, while lidar handles localization and MODT for obstacle avoidance.
Argoverse uses high-definition maps to remove static returns from the ground. Combining the map with the lidar bird's-eye view, ground-level stationary points can be accurately removed, reducing the complexity of the perception computation for autonomous driving and making important moving targets easier to identify. On sloped ground this is harder for naive height thresholds, so Argo AI matches against accurate 3D ground-height maps instead; the final effect is shown in the right column.
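A minimal sketch of this idea follows; the ground-height lookup is a hypothetical stand-in for a map query, not the Argoverse map API:

```python
import numpy as np

def remove_ground_points(points: np.ndarray,
                         ground_height_at_xy,
                         tolerance_m: float = 0.3) -> np.ndarray:
    """Drop LiDAR returns lying within tolerance_m of the mapped ground.

    points: (N, 3) array of city-frame coordinates.
    ground_height_at_xy: callable mapping an (N, 2) array of xy
        coordinates to per-point ground heights (assumed to come
        from the HD map).
    """
    ground_z = ground_height_at_xy(points[:, :2])
    keep = points[:, 2] > ground_z + tolerance_m
    return points[keep]
```

Because the height comes from the map rather than a single plane fit, sloped streets need no special casing; the 0.3 m tolerance is an illustrative choice.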
The Argoverse tracking dataset contains 113 segments with human-annotated 3D trajectories. These segments range from 15 to 30 seconds in length and contain a total of 11,052 tracked objects. All objects of interest, dynamic and static, are annotated with 3D bounding boxes, but only objects within 5 m of the drivable area defined by the map are annotated. For objects that are not visible for the entire segment, a track begins as soon as the object appears in the LiDAR point cloud and is terminated once the object is no longer visible; the same object ID is kept even through temporary occlusion. Each object is labeled with one of 15 categories, including ON_ROAD_OBSTACLE and OTHER_MOVER for static and dynamic objects that fit no other predefined category. More than 70% of the tracked objects are vehicles; pedestrians, bicycles, mopeds, and so on are also observed.
All annotations were produced manually to ensure quality. The annotated tracking data is divided into 65 training, 24 validation, and 24 test sequences.
In motion forecasting, Argoverse predicts the position of a tracked object at some time in the future. The motion of many vehicles is uninteresting for this purpose: in a given frame, most cars are parked or driving at nearly constant speed, and such trajectories are unlikely to represent real-world prediction challenges. Argo AI wanted a benchmark with diverse scenarios: intersections, vehicles slowing down to merge, accelerating out of a turn, pedestrians stopping on the road, and so on. To sample enough of these interesting scenarios, Argo AI tracked objects over 1,006 driving hours in Miami and Pittsburgh and found the behaviors it was interested in during 320 of those hours, mainly objects that were (1) at an intersection, (2) turning left or right, (3) changing to an adjacent lane, or (4) in dense traffic. In total, Argo AI collected 324,557 five-second sequences for the forecasting benchmark.
The geographical distribution of these sequences is shown above.
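A toy version of such a mining filter might look as follows; the thresholds and input predicates are illustrative assumptions, not Argo AI's actual criteria (criterion (3), lane changes, is omitted since it requires map lane geometry):

```python
import numpy as np

def is_interesting(track_xy: np.ndarray,
                   at_intersection: bool,
                   neighbor_count: int) -> bool:
    """Flag a 5-second track (50 samples at 10 Hz, shape (50, 2))."""
    deltas = np.diff(track_xy, axis=0)                 # per-frame displacement
    speeds = np.linalg.norm(deltas, axis=1) * 10.0     # m/s at 10 Hz
    if at_intersection:                                # criterion (1)
        return True
    angles = np.unwrap(np.arctan2(deltas[:, 1], deltas[:, 0]))
    if abs(angles[-1] - angles[0]) > np.deg2rad(45):   # criterion (2): turning
        return True
    if neighbor_count >= 10:                           # criterion (4): dense traffic
        return True
    # Otherwise keep only tracks with meaningful speed variation;
    # parked or constant-speed driving is uninteresting.
    return bool(speeds.max() - speeds.min() > 3.0)
```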
Each sequence contains the 2D bird's-eye-view position of every tracked object, sampled at 10 Hz. The "focal" object in each sequence is always a vehicle, but the other tracked objects can be vehicles, pedestrians, or bicycles; their trajectories serve as context for "social" prediction models. The 324,557 sequences are divided into 205,942 training, 39,472 validation, and 78,143 test sequences, each containing one challenging trajectory. The training, validation, and test sequences are taken from disjoint parts of each city: approximately one eighth and one quarter of each city are set aside for validation and test data, respectively. This is far more data than can be mined from other publicly available autonomous driving datasets. Data at this scale is attractive because it exposes rare behaviors and supports training complex models, but it is too large to exhaustively validate the accuracy of the mined trajectories, so some noise and error are inherent in the data.
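For a sense of the task format, here is a minimal constant-velocity baseline over such sequences. The 2-second-observed / 3-second-predicted split at 10 Hz is this sketch's assumption about how a 5-second sequence is divided:

```python
import numpy as np

def constant_velocity_forecast(observed_xy: np.ndarray,
                               num_future: int = 30) -> np.ndarray:
    """Extrapolate a track at constant velocity.

    observed_xy: (N, 2) bird's-eye-view positions at 10 Hz
        (e.g. N = 20 for 2 s of history).
    Returns a (num_future, 2) array of future positions.
    """
    velocity = observed_xy[-1] - observed_xy[-2]    # displacement per frame
    steps = np.arange(1, num_future + 1)[:, None]   # (num_future, 1)
    return observed_xy[-1] + steps * velocity

# Example: an agent moving 1 m per frame along x keeps doing so.
history = np.stack([np.arange(20.0), np.zeros(20)], axis=1)
future = constant_velocity_forecast(history)        # [[20, 0], [21, 0], ...]
```

Precisely because most real traffic is nearly constant-velocity, a baseline like this is hard to beat on average, which is why the benchmark deliberately mines the harder scenarios described above.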
Argo AI also provides a baseline tracker. The task is defined as follows: given a sequence of F frames, where each frame contains a set of ring camera images and a set of LiDAR points P_i, each with x, y, z coordinates, determine a set of trajectory hypotheses {T_j | j = 1, ..., n}, where n is the number of unique objects in the entire sequence and T_j contains the object's center positions and orientations over time. The observer is usually dynamic, since the collection car itself is often moving, while vehicles in the scene may be stationary or moving.
Baseline Tracker. Argo AI's baseline tracking pipeline works within the drivable area of the LiDAR point cloud (marked on the map) to detect potential objects, prunes non-vehicle LiDAR clusters using Mask R-CNN, associates clusters over time using nearest-neighbor and Hungarian assignment, estimates the transformation between associated clusters using iterative closest point (ICP), and estimates vehicle poses with a classic Kalman filter under a constant-velocity motion model. All vehicles use the same predefined 3D bounding box size.
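The association step can be sketched generically as follows; the 2 m gating distance is an illustrative assumption, not a value from the paper:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_centroids: np.ndarray,
              cluster_centroids: np.ndarray,
              max_distance_m: float = 2.0):
    """Match existing tracks (N, 3) to new LiDAR clusters (M, 3) by
    centroid distance, solving the assignment with the Hungarian
    algorithm and gating out implausibly distant matches."""
    cost = np.linalg.norm(
        track_centroids[:, None, :] - cluster_centroids[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    matches = [(r, c) for r, c in zip(rows, cols)
               if cost[r, c] < max_distance_m]
    matched_r = {r for r, _ in matches}
    matched_c = {c for _, c in matches}
    unmatched_tracks = [r for r in range(len(track_centroids))
                        if r not in matched_r]
    unmatched_clusters = [c for c in range(len(cluster_centroids))
                          if c not in matched_c]
    return matches, unmatched_tracks, unmatched_clusters
```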
If no match is found for an object in the Hungarian step, the motion model alone maintains the object's pose for up to 5 frames, after which the object is removed or associated with a new cluster. This lets the tracker keep the same object ID when an object is briefly occluded and then reappears. If a cluster is not associated with any currently tracked object, a new object ID is initialized for it. The tracker uses the map properties already described above: the drivable area, which restricts where object clusters are searched for, and the ground height, which is used to remove ground returns.
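A constant-velocity Kalman filter with this 5-frame coasting rule can be sketched as follows; the state layout and noise magnitudes are illustrative assumptions:

```python
import numpy as np

class ConstantVelocityKF:
    """Minimal constant-velocity Kalman filter for one tracked object.

    State x = [px, py, vx, vy]; noise magnitudes are illustrative.
    """

    def __init__(self, px: float, py: float, dt: float = 0.1):
        self.x = np.array([px, py, 0.0, 0.0])
        self.P = np.eye(4)
        self.F = np.eye(4)
        self.F[0, 2] = self.F[1, 3] = dt      # constant-velocity motion model
        self.H = np.eye(2, 4)                 # we observe position only
        self.Q = np.eye(4) * 0.01             # process noise
        self.R = np.eye(2) * 0.1              # measurement noise
        self.frames_since_match = 0

    def predict(self) -> None:
        """Advance one frame on the motion model (also used for coasting)."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        self.frames_since_match += 1

    def update(self, z: np.ndarray) -> None:
        """Fuse an associated cluster centroid z = [px, py]."""
        y = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        self.frames_since_match = 0

    def should_terminate(self, max_coast: int = 5) -> bool:
        """Drop the track after coasting unmatched for more than 5 frames."""
        return self.frames_since_match > max_coast
```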