Background: Autonomous positioning and navigation
As shown in the figure, a robot's autonomous positioning and navigation technology consists of two parts: simultaneous localization and mapping (SLAM), and path planning with motion control. SLAM by itself only solves localization and map building; navigation is a separate problem built on top of it. First, let's distinguish the two major approaches to autonomous navigation:
1. Traditional solution: SLAM+path planning+motion control;
2. End-to-end learning: in the past two years, deep learning has become the industry buzzword. Previously, the industry mostly used traditional probabilistic or control-theoretic methods for robot positioning and navigation. In the end-to-end approach, camera data is fed directly into a deep network that outputs the robot's control signals, so the SLAM process and the path planning process can, in principle, be replaced entirely by learning.
1. SLAM
SLAM is the abbreviation of simultaneous localization and mapping, first proposed by Hugh Durrant-Whyte and John J. Leonard. In fact, SLAM is more a framework than a single algorithm: it contains many steps, each of which can be implemented with different algorithms. It mainly solves the problem of a mobile robot localizing itself and building a map in real time while operating in an unknown environment. When you are in an unfamiliar place, how do you accurately find where you want to go? When you get lost outdoors, how do you find your way home? Right: with navigation software and outdoor maps.
1. Select a map
Just as humans draw maps, robots rely mainly on maps to describe and understand their environment, and the map representation chosen depends on the algorithm in use. There are four common ways to represent maps in robotics: grid maps, feature maps, the direct representation method, and topological maps.
(1) Grid map
The most common way for robots to describe an environment is the grid map, or occupancy map. A grid map divides the environment into a series of cells, each assigned a value representing the probability that the cell is occupied. This kind of map looks little different from the maps people already know. It was first proposed by Alberto Elfes in 1989 and was used on Mars rovers. It is essentially a bitmap image, but each "pixel" represents the probability of an obstacle at that spot in the real environment. Generally, this kind of map is used when doing SLAM with sensors that measure distance directly, such as lidar, depth cameras, and ultrasonic sensors (ultrasound in early systems, lidar today).
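To make the representation concrete, here is a minimal sketch in plain Python (the class and method names are my own illustration, not any particular library's API) of an occupancy grid stored as a 2D array of probabilities, where 0.5 marks an unknown cell:

```python
class OccupancyGrid:
    """Minimal occupancy grid: each cell holds P(occupied)."""

    def __init__(self, width, height, resolution=0.25):
        self.resolution = resolution  # meters per cell
        # 0.5 = unknown: no evidence for or against an obstacle yet
        self.cells = [[0.5] * width for _ in range(height)]

    def world_to_cell(self, x, y):
        # Map world coordinates (meters) to (row, col) grid indices.
        return int(y / self.resolution), int(x / self.resolution)

    def probability(self, x, y):
        row, col = self.world_to_cell(x, y)
        return self.cells[row][col]

    def set_probability(self, x, y, p):
        row, col = self.world_to_cell(x, y)
        self.cells[row][col] = p
```

Real systems typically store log-odds rather than raw probabilities and update cells incrementally from sensor data, but the underlying structure is exactly this "probability bitmap".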
(2) Feature point map
Feature maps use geometric features (such as points, lines, and surfaces) to represent the environment, and are commonly found in vSLAM (visual SLAM) technology. Compared with grid maps, this type of map is less intuitive. It is generally generated from sensors such as cameras (via sparse vSLAM methods), GPS, or UWB. Its advantage is that storage and computation costs are relatively small, which is why it appeared in the earliest SLAM algorithms.
(3) Direct characterization method
The direct representation method skips the intermediate step of feature or grid extraction and uses the sensor readings directly to build the robot's pose space. This method is like a satellite map: the raw sensor data is lightly processed and stitched together to form the map, which makes it comparatively intuitive.
(4) Topological map
A topological map is a more abstract form of map. It represents the indoor environment as a graph of nodes and connecting edges, where nodes represent important locations in the environment (corners, doors, elevators, stairs, etc.) and edges represent the connections between them, such as corridors. This method records only the topological connectivity of the environment. Such maps are generally extracted from the previous map types by dedicated algorithms. For example, while cleaning a room, a sweeping robot may build a topological map like this:
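To illustrate how little a topological map needs to store, the sketch below (plain Python, with hypothetical room names) encodes one as an adjacency list and finds a route by breadth-first search; only connectivity matters, not geometry:

```python
from collections import deque

# Hypothetical apartment: nodes are places, edges are traversable connections.
topo = {
    "dock": ["hallway"],
    "hallway": ["dock", "kitchen", "living_room"],
    "kitchen": ["hallway", "balcony"],
    "living_room": ["hallway"],
    "balcony": ["kitchen"],
}

def shortest_route(graph, start, goal):
    # BFS over the topological graph: returns the path with the
    # fewest node-to-node hops, or None if the goal is unreachable.
    frontier, seen = deque([[start]]), {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None
```

Note that the route is a sequence of places, not a geometric trajectory; turning it into actual motion requires one of the metric map types above.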
(5) Summary
In robotics, SLAM map construction usually means building a map that is geometrically consistent with the environment. The topological maps built by typical algorithms only reflect the connectivity between points in the environment and cannot guarantee geometric consistency, so these topological methods cannot be used for SLAM itself.

The direct representation method is similar to a satellite map: it is built directly from sensor data (usually image sensors). It has the greatest information redundancy, poses a serious challenge for data storage, and forces the robot to spend a lot of effort extracting useful data, so it is rarely used in practice. The feature map is the other extreme: although its data volume is small, it often fails to capture necessary information about the environment, such as the locations of obstacles. In vSLAM, feature maps are mostly used to solve the localization problem; if the robot is also to avoid obstacles and plan paths autonomously, additional distance sensors such as lidar or ultrasound must be added. The grid map, or occupancy map, sits in between: it captures many features of the spatial environment and can be used directly for path planning, yet it does not store raw sensor data, striking a reasonable balance between space and time consumption. This is why the grid map is the map storage method most widely used by robots.
2. Positioning and sensor selection
When you open the navigation software on your phone, what must you do first, before choosing the best route to your destination? Right: positioning. We must know our position on the map before any path planning can follow. In real-time robot localization, the position estimate obtained from the robot's own motion (odometry) usually accumulates significant error, so the environment information measured by a ranging unit is also needed to correct the robot's position.
On choosing a positioning scheme, see: Common Positioning and Navigation Technologies for Service Robots and Their Advantages and Disadvantages. Common ranging units include laser, ultrasonic, and image-based ranging. Thanks to the laser's excellent directivity and focus, LiDAR has become the core sensor of mobile robots, and laser ranging is also the most reliable and stable positioning technology.
Since SLAM was proposed in 1988, its theoretical research has developed rapidly. In practical applications, besides being equipped with LiDAR, the robot also needs an IMU (inertial measurement unit) and odometry to provide auxiliary data for the LiDAR. The computation this consumes is huge, traditionally requiring a PC-class processor, which has become one of the bottlenecks limiting the widespread application of SLAM.
3. Sensor data preprocessing
This is the main architecture diagram of a complete SLAM and navigation system:
The core SLAM process includes three steps; the first is called preprocessing. LiDAR, like other ranging devices, can only capture environmental information from its own position at a given instant. This is what we call a point cloud, and it reflects only part of the robot's surroundings. The preprocessing step optimizes the raw LiDAR data, removing problematic readings or applying filtering.
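As a rough illustration of this preprocessing step, the sketch below (plain Python; the range limits and the 3-tap median window are assumed values, not taken from the text) gates out-of-range lidar returns and smooths isolated spikes:

```python
def preprocess(ranges, r_min=0.15, r_max=12.0):
    """Clean one lidar scan: drop out-of-range returns, median-filter spikes."""
    # Mark readings outside the sensor's valid interval as invalid (None).
    clean = [r if r_min <= r <= r_max else None for r in ranges]
    out = []
    for i in range(len(clean)):
        # 3-tap median over valid neighbors suppresses isolated outliers.
        window = [v for v in clean[max(0, i - 1):i + 2] if v is not None]
        out.append(sorted(window)[len(window) // 2] if window else None)
    return out
```

Real drivers also handle intensity values, timestamps, and motion distortion, but range gating plus a small median filter is a representative first pass.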
4. Matching
The second step is matching, that is, finding the corresponding position of the point cloud data of the current local environment on the established map. This step is very critical.
A point cloud matching algorithm is used to achieve this. The process is critical because its quality directly affects the accuracy of the SLAM map. It is a bit like doing a jigsaw puzzle: finding similarities in the already-assembled picture to decide where the new piece belongs. In the SLAM process, the point cloud currently collected by the lidar (the red part) must be matched and stitched into the existing map.
If the matching process is not performed, the constructed map may become a mess, like this.
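The text does not name a specific matcher, but one classic scan-matching scheme is ICP (iterative closest point). The toy 2D version below, in plain Python, alternates nearest-neighbor pairing with a closed-form least-squares rigid fit; production systems use far more robust variants:

```python
import math

def _nearest(p, pts):
    # Closest map point to scan point p (brute force for clarity).
    return min(pts, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def best_rigid_transform(src, dst):
    # Closed-form 2D least-squares fit (Kabsch): rotation theta and
    # translation (tx, ty) that best map src points onto dst points.
    n = len(src)
    sx = sum(p[0] for p in src) / n; sy = sum(p[1] for p in src) / n
    dx = sum(p[0] for p in dst) / n; dy = sum(p[1] for p in dst) / n
    s_cos = s_sin = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax -= sx; ay -= sy; bx -= dx; by -= dy
        s_cos += ax * bx + ay * by
        s_sin += ax * by - ay * bx
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)

def icp(scan, map_pts, iterations=20):
    # Repeatedly pair each scan point with its nearest map point,
    # then apply the best rigid transform for those pairs.
    pts = list(scan)
    for _ in range(iterations):
        pairs = [(p, _nearest(p, map_pts)) for p in pts]
        theta, tx, ty = best_rigid_transform([a for a, _ in pairs],
                                             [b for _, b in pairs])
        c, s = math.cos(theta), math.sin(theta)
        pts = [(c * x - s * y + tx, s * x + c * y + ty) for x, y in pts]
    return pts
```

Like the jigsaw analogy, ICP only works when the initial guess is close enough that most nearest-neighbor pairings are correct; a bad initial pose is exactly how maps "become a mess".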
5. Map Fusion
After matching is completed, the third step, map fusion, stitches the new lidar data into the original map and updates it, as in this picture. This process runs continuously alongside SLAM.
Data fusion is very different from simply pasting data in, because sensor readings always carry some error, and the environment may happen to change at that very moment, for example a kitten walking in front of the robot.
The actual process is therefore more complicated, relying on probabilistic algorithms and filtering for fusion. Carrying out these steps one by one eventually produces the grid map we see.
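One standard probabilistic fusion scheme (a common choice, though the text does not specify one) is the log-odds occupancy update: each cell accumulates evidence over repeated observations, so a single noisy reading, or a passing cat, cannot flip the map. The increments below are assumed sensor-model values:

```python
import math

# Assumed sensor model: evidence added per "hit" (obstacle seen)
# or per "miss" (beam passed through the cell).
L_HIT, L_MISS = 0.85, -0.4

def to_log_odds(p):
    return math.log(p / (1.0 - p))

def to_probability(l):
    return 1.0 - 1.0 / (1.0 + math.exp(l))

def fuse(l, hit):
    # Bayesian update of one grid cell in log-odds form:
    # addition here is multiplication of likelihoods.
    return l + (L_HIT if hit else L_MISS)
```

Because updates are additive, a cell briefly occluded by a moving object drifts back toward "free" as soon as later scans see through it again.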
6. Loop Closure Problem
This process doesn't sound complicated, but it is very difficult to do well. Take the so-called loop closure problem: if the matching algorithm is not good enough, or the environment happens to interfere, then when the robot travels a loop through the environment, a corridor that should close back on itself ends up disconnected. For example, a normal map should look like this:
If it is not handled well, the actual map will be like this:
In larger environments, the loop closure problem cannot be avoided, and reality is never perfect: even sensors like lidar inevitably have errors. The difficulty of loop closure is that a small error introduced early on is not discovered until the robot has traveled the whole loop; by the time the accumulated error makes the loop fail to close, the damage is done and is generally hard to recover from.
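This accumulation of small errors can be illustrated with a tiny dead-reckoning simulation (plain Python, illustrative numbers): a constant heading bias of a fraction of a degree per turn is enough to keep a square loop from closing:

```python
import math

def dead_reckon_square(side_len, heading_bias=0.0):
    # Drive a square by odometry alone: each commanded 90-degree turn
    # is corrupted by a small constant heading error (radians).
    x = y = heading = 0.0
    for _ in range(4):                # four sides of the square
        for _ in range(side_len):     # unit-length forward steps
            x += math.cos(heading)
            y += math.sin(heading)
        heading += math.pi / 2 + heading_bias
    # Distance between where the robot ends up and where it started:
    # zero for a perfectly closed loop.
    return math.hypot(x, y)
```

With zero bias the loop closes exactly; with a 0.01 rad (about 0.6 degree) bias per turn the end point misses the start by a visible gap, and nothing in pure dead reckoning can detect this until the loop is complete.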
Of course, this problem is not unsolvable. For a commercial SLAM system, how well it handles loop closure is an indicator of its quality. Here is a test conducted in the "Silan" office a couple of days ago: the map on the left was built with an open-source robot stack, and the map on the right with SLAMWARE.
After the robot has circled the field, the map built by ROS is interrupted, while the map built by SLAMWARE is a perfect closed loop, which perfectly overlaps with the design drawing of our office.
Besides the loop closure problem at the algorithm level, practical SLAM applications hide many other pitfalls, such as the corridor problem and external interference. On external interference: the lidar, serving as the robot's eyes, is usually mounted on the chassis, so its field of view is very limited. When disturbed by the outside world (people, pets, and so on), the robot can easily lose its position estimate and fail to continue mapping normally.
With SLAMWARE installed, the robot is unaffected by such interference and continues to work normally. At present, open-source SLAM implementations mostly come from academia, and real deployments involve many corner cases that require joint tuning of the sensors, system parameters, and other auxiliary equipment.
Generally speaking, the SLAM process described above consumes a lot of computing power. Although it does not reach the level of training a neural network on a server cluster, it traditionally requires a PC-class processor. Besides the LiDAR, the robot also needs an IMU (inertial measurement unit) and odometry to provide auxiliary data, without which the SLAM system can hardly operate. In short, SLAM has multiple dependencies on external systems and is very much a practical engineering problem.