At the recently concluded CVPR, isee, Peking University, UCLA, and MIT jointly presented a model called Multi-Agent Tensor Fusion (MATF). The model encodes the past trajectories of multiple agents and the scene context into a multi-agent tensor, then applies convolutional fusion to capture interactions among the agents while preserving the spatial structure of the agents and the scene. It is trained with an adversarial loss to learn stochastic predictions. Experiments on highway driving and crowded pedestrian datasets show that the model achieves state-of-the-art prediction accuracy.
Driving is a social activity. Consider the impressive multi-agent social interaction in this scene (featuring a headache-inducing roundabout):
Drivers navigate a complex scenario while remaining largely safe. It is remarkable that human drivers can maintain a high level of safety while interacting closely with other road users in the same environment, even though they cannot fully know the intentions of the other drivers. So how do human drivers accomplish this feat?
Social prediction is an essential part of driving
Human drivers use their social intelligence to predict how other traffic participants’ future actions will depend on their interactions with each other and with the scene. By predicting the trajectories of nearby traffic participants, drivers can proactively plan safe interactions and minimize emergency responses, such as hard braking when an unexpected situation arises.
However, a human driver can never predict with complete certainty what trajectory another vehicle will follow. Drivers often find themselves wondering, “Will he yield?” “Will he suddenly speed up?” “How slow will he go?”
Learning to predict
The researchers developed a neural network architecture that learns from large-scale data to make probabilistic predictions about the trajectories of other agents. Their approach relies only on training data collected during driving, and aims to generalize as far as possible across environments, scenarios, and agent types (trucks, cars, buses, motorcycles, bicycles, pedestrians, etc.).
isee, together with Peking University, UCLA, and MIT, developed a new method called Multi-Agent Tensor Fusion (MATF). By aligning scene features and agent trajectory features in a multi-agent tensor (MAT) representation, it combines the advantages of spatial and agent-centric representations, as shown below. Because the MAT encoding is processed with convolution operations, it naturally handles scenes with varying numbers of agents, and the computational cost of predicting the trajectories of all agents in a scene grows linearly with the number of agents. GAN training allows MATF to learn a distribution over predicted trajectories that captures the uncertainty about how the situation will develop. MATF predicts the trajectories jointly, so it can account for interactive behaviors between vehicles such as decelerating and yielding.
Here is a more detailed description of the MATF architecture. MATF first encodes the relevant information about the scene, and separately processes the past trajectory of each agent with a recurrent neural network to encode the relevant information about that agent. The network then spatially aligns the scene and agent features into a multi-agent tensor, preserving the local and non-local spatial relations in the scene. Multi-agent tensor fusion is then performed with learned fully convolutional mappings, producing a fused multi-agent tensor that serves as the final encoding of the multi-agent driving scene. The convolutional mapping is shared by all agents: it captures the spatial relationships and interactions among them and is applied to every agent in the scene at the same time. Finally, MATF decodes the fused multi-agent tensor probabilistically to produce predicted trajectories that are sensitive to both the scene features and the trajectories of the surrounding agents.
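The alignment and fusion steps described above can be illustrated with a minimal pure-Python sketch. It is not the researchers' implementation: the function names `build_multi_agent_tensor` and `fuse` are hypothetical, the fixed 3x3 averaging filter merely stands in for the learned convolutional layers, and merging agents that share a grid cell by element-wise maximum is an assumption made for this example. The agent encodings, which would come from a recurrent network in MATF, are given here as plain lists.

```python
def build_multi_agent_tensor(scene, agents):
    """Align scene and agent features on one spatial grid.

    scene:  H x W grid, each cell a list of scene features.
    agents: non-empty list of ((row, col), encoding) pairs.
    """
    scene_dim = len(scene[0][0])
    enc_dim = len(agents[0][1])
    # Every cell gets the scene channels plus zero-filled agent channels.
    tensor = [[list(cell) + [0.0] * enc_dim for cell in row] for row in scene]
    occupied = set()
    for (r, c), enc in agents:
        slot = tensor[r][c]
        for k, v in enumerate(enc):
            idx = scene_dim + k
            # Assumption: agents sharing a cell are merged by element-wise max.
            slot[idx] = max(slot[idx], v) if (r, c) in occupied else v
        occupied.add((r, c))
    return tensor


def fuse(tensor):
    """Apply one shared 3x3 averaging filter (stand-in for learned convolutions)."""
    h, w, d = len(tensor), len(tensor[0]), len(tensor[0][0])
    fused = [[[0.0] * d for _ in range(w)] for _ in range(h)]
    for r in range(h):
        for c in range(w):
            cells = [tensor[rr][cc]
                     for rr in range(max(0, r - 1), min(h, r + 2))
                     for cc in range(max(0, c - 1), min(w, c + 2))]
            for k in range(d):
                fused[r][c][k] = sum(cell[k] for cell in cells) / len(cells)
    return fused


# Toy scene: 3 x 3 grid with one scene channel, and two neighbouring agents.
scene = [[[1.0] for _ in range(3)] for _ in range(3)]
agents = [((0, 0), [2.0]), ((0, 1), [4.0])]

mat = build_multi_agent_tensor(scene, agents)
fused = fuse(mat)
# Read the fused feature back out at each agent's own cell.
readout = [fused[r][c] for (r, c), _ in agents]
print(readout)
```

The fusion pass visits the grid once regardless of how many agents were placed in it, and each agent's read-out already mixes in the encodings of its neighbours, which is the property the tensor representation is meant to provide.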
The researchers use a conditional Generative Adversarial Network (GAN) training technique to learn a probability distribution over trajectories given a MATF encoding. GANs make it possible to learn high-fidelity generative models that capture the distribution of the observed data. In a driving context, the modes of the distribution correspond to the different maneuvers that a vehicle or pedestrian may perform, such as following a lane or path versus changing lanes or paths. The spread around each mode corresponds to how the maneuver is carried out: fast, slow, aggressive, cautious, and so on. GANs naturally capture both types of variability. Importantly, the GAN algorithm trains the model to generate joint trajectories that account for interactions between vehicles, such as yielding and collision avoidance.
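To make the adversarial objective concrete, the following sketch computes the textbook GAN losses from discriminator scores. It is an illustration only: the article does not state the exact loss the researchers used, so the standard discriminator loss and the non-saturating generator loss are assumed here, and the function names are hypothetical. Each score is the discriminator's estimated probability, strictly between 0 and 1, that a trajectory is real given the scene encoding.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Standard GAN discriminator loss for one real and one generated trajectory.

    d_real: discriminator score for an observed trajectory, in (0, 1).
    d_fake: discriminator score for a generated trajectory, in (0, 1).
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Non-saturating generator loss: small when the discriminator is fooled."""
    return -math.log(d_fake)


# A confident discriminator: the real sample scores high, the generated one low.
d_loss_confident = discriminator_loss(d_real=0.9, d_fake=0.1)
g_loss_confident = generator_loss(d_fake=0.1)

# A fooled discriminator: it cannot tell the two apart.
d_loss_fooled = discriminator_loss(d_real=0.5, d_fake=0.5)
g_loss_fooled = generator_loss(d_fake=0.5)

print(d_loss_confident < d_loss_fooled)  # discriminator is better off when confident
print(g_loss_fooled < g_loss_confident)  # generator is better off when it fools D
```

Because the generator is rewarded only for producing trajectories the discriminator accepts as realistic, it has no incentive to average over maneuvers, which is why this style of training can preserve distinct modes such as lane keeping and lane changing.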
Results
The researchers first applied their model to predicting vehicle trajectories, using large-scale driving data collected by isee. The figure below shows five scenarios, with each vehicle's past trajectory shown in a different color, followed by 100 sampled future trajectories. The ground truth trajectories are shown in black, and the lane centers are shown in gray. (a) shows a complex scenario involving five vehicles; MATF accurately predicts the trajectories and velocity profiles of all vehicles. In (b), MATF correctly predicts that the red vehicle will complete the lane change. In (c), MATF captures the uncertainty about whether the red vehicle will take the highway exit. In (d), once the purple vehicle has passed the highway exit, MATF predicts that it will not take the exit. In (e), MATF fails to accurately predict the ground truth trajectory of the red vehicle; however, a small number of the sampled trajectories do show the vehicle initiating a lane change, reflecting the low prior probability of spontaneous lane changes learned from the dataset.
Next, the researchers applied their model to predicting the trajectories of pedestrians and several other types of agents in the Stanford Drone Dataset, a large dataset containing trajectories of pedestrians, cyclists, skateboarders, carts, cars, and buses moving around a university campus. In the figure below, the blue line shows the past trajectory, the red line shows the ground truth trajectory, and the green line shows the predicted trajectory. The trajectories of all the agents shown in the figure were predicted jointly in a single forward pass through the network. The model predicts that: (1) two agents entering the roundabout from the top will exit to the left; (2) an agent coming from the left on the pathway above the roundabout will turn left and move toward the top of the image; and (3) an agent will slow down at the doorway of the building above and to the right of the roundabout. In an interesting failure case, (4), an agent at the top right corner of the roundabout turns right to move toward the top of the image; the model predicts the turn but not the correct turning angle.
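One way to read a set of sampled trajectories, as in scenario (e), is to count how many samples end in a different lane and treat that fraction as the model's estimated probability of a lane change. The sketch below does this on made-up data: the function name, the coordinate convention (lateral offset in meters from the current lane center), and the 3.7 m lane width are all assumptions for illustration and do not come from the researchers' evaluation.

```python
def lane_change_fraction(samples, lane_width=3.7):
    """Fraction of sampled trajectories whose final point lies outside the current lane.

    samples: list of trajectories, each a list of (longitudinal, lateral) offsets
             in meters relative to the vehicle's current lane center.
    lane_width: assumed lane width in meters (hypothetical default).
    """
    threshold = lane_width / 2.0
    changed = sum(1 for traj in samples if abs(traj[-1][1]) > threshold)
    return changed / len(samples)


# Made-up samples: 92 keep the lane, 8 drift into the adjacent lane.
keep_lane = [[(0.0, 0.0), (30.0, 0.2)] for _ in range(92)]
change_lane = [[(0.0, 0.0), (30.0, 3.5)] for _ in range(8)]

fraction = lane_change_fraction(keep_lane + change_lane)
print(fraction)  # 0.08
```

A small but non-zero fraction like this corresponds to the behavior described for scenario (e): the lane change is not the most likely outcome, yet the sampled distribution still assigns it some probability.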