Baidu releases an autonomous driving dataset that it says is "10 times larger than similar datasets". Here is what it contains.
Text | New Intelligent Driving
Report from Leiphone.com (leiphone-sz)
Baidu's Apollo autonomous driving platform has always been open and has steadily attracted new members. Recently it reversed roles and announced that it had itself joined DeepDrive, the autonomous driving industry alliance at the University of California, Berkeley.
On March 8, US time, Baidu announced that the Apollo open autonomous driving platform had officially joined the DeepDrive deep learning autonomous driving industry alliance, and it released the Apollo autonomous driving dataset, ApolloScape.
What attracted the Baidu Apollo platform to join this alliance is probably DeepDrive's rich academic achievements and industrial resources in autonomous driving.
DeepDrive is one of the two major laboratories working on automotive intelligence at UC Berkeley (University of California, Berkeley); the other is InterACT.
DeepDrive's research results do not stay in the laboratory but are closely tied to industry. Its current partners include Tier 1 suppliers such as Bosch and ZF, automakers such as Volkswagen, Honda, and Hyundai, chip manufacturers such as NXP and NVIDIA, and Chinese companies such as Huawei and UISEE.
*Partners in the DeepDrive research project
The DeepDrive Deep Learning Autonomous Driving Industry Alliance is an industry alliance led by the University of California, Berkeley, that researches cutting-edge computer vision and machine learning technologies for use in the automotive field.
Its members include 20 of the world's top companies in the field of autonomous driving, including NVIDIA, Qualcomm, GM, and Ford. Its research projects cover key areas of autonomous driving, such as perception, planning and decision-making, and deep learning.
Baidu's purpose in joining the alliance is to strengthen its autonomous driving research and development capabilities. By working with the world's leading autonomous driving companies and top academic research institutions, it aims to share research results and accelerate technological innovation and the deployment of autonomous driving.
ApolloScape: The data volume is more than 10 times that of similar datasets
Another highlight of this release is the ApolloScape dataset opened by Baidu.
Datasets generally fall into two categories. The first is the general-purpose dataset, which comes from the pure computer vision field and is relevant to driving only because it happens to contain the "car" element. The second is the autonomous driving dataset, which includes not only computer vision information but also IMU, GPS, and other sensor data. For example, KITTI is currently the largest computer vision algorithm evaluation dataset in the world for autonomous driving scenarios, and its importance should not be underestimated.
Obviously, Baidu also hopes to build ApolloScape into such a dataset. So, what are the highlights of the ApolloScape dataset?
Baidu believes that massive, high-quality real-world data is the indispensable "raw material" for developing and testing autonomous driving. Accordingly, the data volume of ApolloScape is more than 10 times that of similar datasets (such as Cityscapes).
The dataset covers perception data, simulation scenes, and road network data, and it includes hundreds of thousands of frames of high-resolution images with pixel-by-pixel semantic segmentation annotations. Baidu said that in terms of difficulty, ApolloScape covers more complex road conditions: a single image can contain up to 162 vehicles or 80 pedestrians.
In addition, this open dataset uses pixel-by-pixel semantic segmentation and annotation. Baidu claims that this is "the most complex, most accurately annotated, and largest autonomous driving dataset currently available."
ApolloScape Annotated Data Example
ApolloScape Depth Data Example
Comparison of KITTI, Cityscapes and ApolloScape on data instances
Another feature of ApolloScape is that it contains hundreds of thousands of frames of high-resolution image data with pixel-by-pixel semantic segmentation annotations.
To help researchers make better use of the dataset, Baidu has defined a total of 26 semantic categories in it (such as cars, bicycles, pedestrians, buildings, and street lights), and it plans to cover more complex environments, weather, and traffic conditions in the future.
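To make "pixel-by-pixel semantic segmentation" and per-image instance counts concrete, here is a minimal Python sketch. It assumes each frame comes with a label map (one class ID per pixel) and an instance map (one object ID per pixel). The class IDs, the names `CLASS_NAMES`, `pixel_counts`, and `instance_counts`, and the tiny 4x6 "image" are all hypothetical and chosen for illustration; they are not ApolloScape's actual label definitions or file format.

```python
from collections import Counter

# Hypothetical class IDs for illustration only; NOT ApolloScape's real label map.
CLASS_NAMES = {0: "background", 1: "car", 2: "bicycle", 3: "pedestrian", 4: "building"}


def pixel_counts(label_map):
    """Count how many pixels carry each class name in a 2-D label map."""
    counts = Counter()
    for row in label_map:
        for class_id in row:
            counts[CLASS_NAMES.get(class_id, "unknown")] += 1
    return counts


def instance_counts(label_map, instance_map):
    """Count distinct object instances per class.

    instance_map holds one instance ID per pixel; 0 means "no instance"
    (assumed convention for this sketch).
    """
    seen = set()
    for label_row, inst_row in zip(label_map, instance_map):
        for class_id, inst_id in zip(label_row, inst_row):
            if inst_id != 0:
                seen.add((class_id, inst_id))
    counts = Counter()
    for class_id, _ in seen:
        counts[CLASS_NAMES.get(class_id, "unknown")] += 1
    return counts


# A toy 4x6 frame: one building, two cars, one pedestrian.
label_map = [
    [4, 4, 0, 0, 0, 0],
    [4, 4, 1, 1, 0, 3],
    [0, 1, 1, 1, 1, 3],
    [0, 1, 1, 0, 1, 1],
]
instance_map = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1],
    [0, 1, 1, 1, 2, 1],
    [0, 1, 1, 0, 2, 2],
]

print(dict(pixel_counts(label_map)))
print(dict(instance_counts(label_map, instance_map)))
```

Applied to real annotation files, a statistic like `instance_counts` is the kind of figure behind claims such as "up to 162 vehicles or 80 pedestrians in a single image".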
Information about each instance contained in the data
Simulation is another key part of this dataset. Baidu's goal is to build the simulation platform that reproduces the real world most faithfully and offers the richest set of scenes.
According to Leiphone.com, ApolloScape plans to build on the Apollo simulation platform and place dozens of autonomous vehicles in the same road network. By simulating realistic, complex driving scenarios and the game-like interactions among multiple vehicles, it aims to help R&D personnel test and optimize prediction, decision-making, path planning, and other algorithms, and to make autonomous driving tests more diverse.
To put the dataset to active use and attract more developers to ApolloScape, Baidu Apollo will jointly host a Workshop on Autonomous Driving with the University of California, Berkeley during this year's CVPR, hoping to give autonomous driving developers and researchers worldwide a platform for technological breakthroughs and application innovation.
"Big system" and "small module"
A long-standing problem in computer vision is that algorithms developed on older datasets often fail on new ones.
"We claim to have solved a problem, but we have only solved a data set. This does not mean that we have truly solved the problem, and this happens all the time," the CTO of a domestic autonomous driving company told Leiphone.com.
For example, we can break down the “big system” of autonomous driving into 100 small computer vision problems.
But there are two points worth pondering here: first, we don’t know which of these 100 problems is more important; second, we don’t know which problem we need to solve and to what extent before we can claim that we have completely solved the problem of autonomous driving.
Therefore, bridging the gap between the "big system" of autonomous driving and its "small modules" is the next advantage the Baidu ApolloScape dataset needs to establish, and it is what practitioners and developers need from an autonomous driving dataset.