How do driverless cars learn to drive step by step?

Source: 量子位 (QbitAI) | Last updated: 2017-12-26

Similar to how humans use their eyes to observe the road and their hands to control the steering wheel, self-driving cars use a row of cameras to perceive the environment and deep learning models to guide driving. Generally speaking, this process is divided into five steps:

  • Recording environmental data

  • Analyzing and processing the data

  • Building a model that understands the environment

  • Training the model

  • Refining the model over time

If you want to understand the principles of driverless cars, this article is not to be missed.

Recording environmental data

A driverless car first needs the ability to record data about its environment.

Specifically, our goal is to obtain an even distribution of left and right steering angles. This is not hard to achieve: drive the test track in both clockwise and counterclockwise directions. Training on both reduces steering bias and avoids the embarrassing situation where the car slowly drifts from one side of the road to the other after driving for a while.

Additionally, driving at a slow speed (e.g., 10 miles per hour) also helps record smooth steering angles while turning. Here, driving behaviors are classified as:

  • Straight-line driving: 0 <= X < 0.2

  • Small turn: 0.2 <= X < 0.4

  • Sharp turn: X >= 0.4

  • Returning to center

Here X is the steering angle, computed as X = 1/r, where r is the turning radius in meters. The "returning to center" behavior above is very important during data recording: it teaches the vehicle to steer back to the center of the lane when it is about to run off the edge of the road. The recorded data are saved in driving_log.csv, where each line contains the following (a parsing sketch follows the list):

  • File path to the image from the front-center camera

  • File path to the image from the front-left camera

  • File path to the image from the front-right camera

  • Steering angle
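As a rough illustration, here is a minimal Python sketch that parses such a log and buckets each angle into the behaviors listed earlier. Treating left and right turns symmetrically via abs() is an assumption on our part, since recorded angles for left turns may be negative:

```python
import csv

def classify_turn(x):
    """Bucket a steering angle X = 1/r into the behaviors above."""
    x = abs(x)  # assumption: treat left and right turns symmetrically
    if x < 0.2:
        return 'straight-line driving'
    elif x < 0.4:
        return 'small turn'
    return 'sharp turn'

# Each row: center, left, right image paths, then the steering angle.
with open('driving_log.csv') as f:
    for row in csv.reader(f):
        angle = float(row[3])
        print(row[0], angle, classify_turn(angle))
```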

During data recording, we need to capture roughly 100,000 images with their steering angles, enough data to train the model without overfitting to a small sample. By periodically plotting a histogram of the recorded steering angles, we can check whether their distribution is symmetric.
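One simple way to run this check is a matplotlib histogram; this sketch assumes the driving_log.csv layout described above:

```python
import csv
import matplotlib.pyplot as plt

# Steering angle is the last recorded column in each log row.
with open('driving_log.csv') as f:
    angles = [float(row[3]) for row in csv.reader(f)]

# A roughly symmetric distribution around zero indicates the clockwise
# and counterclockwise laps balanced out any steering bias.
plt.hist(angles, bins=50)
plt.xlabel('steering angle')
plt.ylabel('number of samples')
plt.show()
```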

Analyzing and processing the data

The second step is to analyze and prepare the data just recorded for building the model. The goal at this point is to generate more training samples for the model.

Images captured by the front-center camera have a resolution of 320*160 pixels and contain red, green, and blue channels. In Python, each image can be represented as a three-dimensional array in which every pixel value ranges from 0 to 255.

What matters most for steering is the road in the driver's line of sight and the lane markings on both sides. The sky above that region and the car's hood at the bottom of the frame can be cropped out using Cropping2D in Keras to reduce the noise fed into the model, as in the sketch below.
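A minimal sketch of this crop, assuming we trim 70 rows of sky from the top and 25 rows of hood from the bottom; these margins are illustrative assumptions, not values from the article:

```python
from keras.models import Sequential
from keras.layers import Cropping2D

model = Sequential()
# Cropping margins are ((top, bottom), (left, right)) in pixels;
# 320x160 RGB images arrive as arrays of shape (160, 320, 3).
model.add(Cropping2D(cropping=((70, 25), (0, 0)), input_shape=(160, 320, 3)))
```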

We can use the open source computer vision library OpenCV to read the image from the file and then flip it along the vertical axis to generate a new sample. OpenCV is well suited for self-driving car use cases because it is written in C++. Other image augmentation techniques like tilt and rotation can also help generate more training samples.

When an image is flipped, its steering angle must be flipped as well, by multiplying it by -1.0, since the mirrored image turns in the opposite direction.

Afterwards, the image can be kept as a three-dimensional array using the open source NumPy library, ready for the modeling step that follows (see the sketch below).
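A minimal sketch of the flip augmentation, assuming a hypothetical image file and angle value:

```python
import cv2
import numpy as np

# Hypothetical inputs: one recorded frame and its steering angle.
image = cv2.imread('center_camera.jpg')   # shape (160, 320, 3), BGR
angle = 0.25

flipped_image = cv2.flip(image, 1)        # mirror around the vertical axis
flipped_angle = angle * -1.0              # negate the steering angle to match

# Stack into NumPy arrays, the 3D-array-per-sample form the model expects.
X = np.asarray([image, flipped_image])
y = np.asarray([angle, flipped_angle])
```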

Building a model that understands the environment

With the image data in hand, we need to build a deep learning model that lets the driverless car understand its environment by extracting features from the recorded images.

Specifically, our goal is to map an input image of 153,600 values (320 * 160 pixels * 3 color channels) to a single floating-point output. The model proposed by NVIDIA, in which each layer serves a specific function, works well as a base architecture.

NVIDIA's paper describing this model: https://arxiv.org/pdf/1604.07316v1.pdf

First, we normalize the pixel values in the 3D array so that large raw values do not bias the model. We divide by 255.0 because that is the maximum possible value of a pixel.

We also crop away the sky above the driver's field of view and the car's hood at the bottom of the frame, as described earlier, to reduce noise.

Next, we apply convolutions to the 3D array to extract key features such as lane markings, which are crucial for predicting the steering angle.

We want the developed model to be able to handle any type of road, so we need to use dropout to reduce overfitting.

Finally, we need to output the steering angle as a float.
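Putting these steps together, here is a minimal Keras sketch in the spirit of the NVIDIA architecture. The crop margins, filter counts, dropout rate, and dense-layer sizes are illustrative assumptions rather than values taken from the article:

```python
from keras.models import Sequential
from keras.layers import Lambda, Cropping2D, Conv2D, Dropout, Flatten, Dense

model = Sequential()
# Normalize pixel values from [0, 255] to [0, 1]
model.add(Lambda(lambda x: x / 255.0, input_shape=(160, 320, 3)))
# Crop the sky (top) and the hood (bottom)
model.add(Cropping2D(cropping=((70, 25), (0, 0))))
# Convolutional layers extract features such as lane markings
model.add(Conv2D(24, (5, 5), strides=(2, 2), activation='relu'))
model.add(Conv2D(36, (5, 5), strides=(2, 2), activation='relu'))
model.add(Conv2D(48, (5, 5), strides=(2, 2), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
# Dropout reduces overfitting so the model generalizes to new roads
model.add(Dropout(0.5))
model.add(Flatten())
# Fully connected layers map the extracted features to one output
model.add(Dense(100, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(1))  # the steering angle, as a single float
```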

Training the model

After building the model, we need to train the model to learn to drive on its own.

From a technical point of view, the goal at this stage is to predict the steering angle as accurately as possible. Here, we define the loss as the mean squared error between the predicted and actual steering angles.

We draw samples randomly from driving_log.csv to reduce ordering bias.

We can use 80% of the samples as the training set and 20% as the validation set, which lets us see how accurately the model predicts steering angles (see the sketch below).
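A minimal sketch of the shuffle and split, using scikit-learn's train_test_split (our choice of tool here, not one named in the article):

```python
import csv
from sklearn.model_selection import train_test_split

# Read all rows from the driving log described earlier.
with open('driving_log.csv') as f:
    samples = list(csv.reader(f))

# Shuffle to reduce ordering bias, then split 80% / 20%.
train_samples, validation_samples = train_test_split(
    samples, test_size=0.2, shuffle=True)
```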

After that, we minimize the mean squared error with Adam (Adaptive Moment Estimation). Compared with plain gradient descent, one advantage of Adam is that it borrows the concept of momentum from physics, which helps it converge toward an optimum faster.

Finally, we fit the model using a generator. Because there are so many images, we cannot load the entire training set into memory at once; instead, a generator produces batches of images for training.
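A sketch of such a generator and the training call. It assumes the model from the architecture sketch above and the train/validation splits from the previous step; the batch size and epoch count are illustrative:

```python
import cv2
import numpy as np
from sklearn.utils import shuffle

def generator(samples, batch_size=32):
    """Yield (images, angles) batches forever; Keras stops each epoch
    via steps_per_epoch."""
    num_samples = len(samples)
    while True:
        samples = shuffle(samples)
        for offset in range(0, num_samples, batch_size):
            batch = samples[offset:offset + batch_size]
            images, angles = [], []
            for line in batch:
                # driving_log.csv layout: center, left, right image
                # paths, then the steering angle.
                images.append(cv2.imread(line[0]))
                angles.append(float(line[3]))
            yield np.asarray(images), np.asarray(angles)

# Loss: mean squared error between predicted and actual steering
# angles, minimized with the Adam optimizer.
model.compile(optimizer='adam', loss='mse')
model.fit(generator(train_samples),
          steps_per_epoch=len(train_samples) // 32,
          validation_data=generator(validation_samples),
          validation_steps=len(validation_samples) // 32,
          epochs=5)
```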

Refining the model over time

Refining the model is the final step: improving its accuracy and robustness over time. Our experiments try different architectures and hyperparameters and observe their effect on the mean squared error. Which model is best? There is no single answer, because most improvements sacrifice something else. For example:

  • Using a better graphics processing unit (GPU) reduces training time, but increases cost.

  • Lowering the learning rate increases the probability of converging to the optimum, but lengthens training time.

  • Using grayscale images reduces training time, but discards the color information carried by the red, green, and blue channels.

  • Using a larger batch size improves the accuracy of the gradient estimates, but at the expense of memory usage.

  • Drawing more samples per epoch reduces the volatility of the loss, but makes each epoch take longer.

Looking back over the whole process, developing a self-driving car is also a process of coming to understand the strengths and limitations of computer vision and deep learning.

