3.2 Visual-Inertial SLAM
IMU sensors provide a good solution to tracking failure when the camera moves through a challenging environment (low texture and/or lighting changes); conversely, visual sensors can compensate for the accumulated drift of the IMU. This combination of vision and IMU is often called a "golden pair." Because cameras and IMUs are complementary, the combination has good development prospects in fields such as autonomous driving (Sun and Tian, 2019). The main approach of VI-SLAM is to fuse IMU information into the front end of the visual SLAM system, which is also called a visual-inertial odometry (VIO) system. Generally, VI-SLAM systems can be divided into two categories: filter-based methods and optimization-based methods.
3.2.1 Filter-based methods
In 2007, Mourikis and Roumeliotis (2007) proposed the multi-state constraint Kalman filter (MSCKF), the earliest visual-inertial SLAM system based on the extended Kalman filter (EKF).
Compared with pure visual odometry, MSCKF (Figure 5(a)) adapts better to aggressive motion and short periods of texture loss, and is more robust. In 2012, Stephan (2012) proposed SSF (Figure 5(b)), a time-delay-compensated single- and multi-sensor fusion framework based on the EKF and loose coupling. In 2013, Li and Mourikis (2013) pointed out the inconsistency of MSCKF in the state-estimation process. In 2017, Paul et al. (2017) proposed MSCKF 2.0, which greatly improved accuracy, consistency, and computational efficiency. In addition, ROVIO (Robust Visual-Inertial Odometry) (Bloesch et al., 2015) (Figure 5(c)) and MSCKF-VIO (Ke et al.) (Figure 5(d)) are other notable filter-based works of recent years.
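The filter-based coupling described above can be illustrated with a minimal sketch: an EKF whose state is propagated at IMU rate and corrected by lower-rate visual position fixes. The 1-D state, noise values, and sensor rates below are illustrative assumptions, not any particular system's implementation:

```python
import numpy as np

# Minimal loosely coupled EKF sketch (illustrative only: 1-D position/velocity
# state, hypothetical noise values, 100 Hz IMU and 10 Hz camera). The IMU
# acceleration drives the prediction; occasional visual position fixes correct
# the accumulated drift -- the complementary behaviour described above.

def predict(x, P, accel, dt, q=1e-3):
    """Propagate state [pos, vel] with an IMU acceleration measurement."""
    F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity transition
    B = np.array([0.5 * dt**2, dt])         # acceleration input mapping
    x = F @ x + B * accel
    P = F @ P @ F.T + q * np.eye(2)         # process noise inflates uncertainty
    return x, P

def update(x, P, z_pos, r=1e-2):
    """Correct the state with a visual position measurement z_pos."""
    H = np.array([[1.0, 0.0]])              # camera observes position only
    S = H @ P @ H.T + r                     # innovation covariance (1x1)
    K = P @ H.T / S                         # Kalman gain (2x1)
    x = x + (K * (z_pos - H @ x)).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

x, P = np.zeros(2), np.eye(2)
for k in range(100):
    x, P = predict(x, P, accel=0.1, dt=0.01)          # 100 Hz IMU
    if k % 10 == 9:                                   # 10 Hz camera fix
        t = (k + 1) * 0.01
        x, P = update(x, P, z_pos=0.5 * 0.1 * t**2)   # noiseless synthetic fix
print(x)  # position/velocity estimate after 1 s of constant acceleration
```

Tightly coupled filters such as MSCKF instead keep past camera poses in the EKF state and use feature tracks directly, but the predict/update structure is the same.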
3.2.2 Optimization-based methods
Among optimization-based VI-SLAM systems, the most classic framework is OKVIS. In 2015, Leutenegger et al. (2015) proposed OKVIS, which uses IMU measurements to predict the current state and uses spatial points and 2D image features to form a reprojection error. The predicted IMU state and the optimized parameters form an IMU error term, which is then combined with the reprojection error for joint optimization. In 2017, Tong et al. (2017) proposed VINS-Mono, regarded as an excellent monocular VI-SLAM system, which uses an optical-flow method at the front end and a sliding-window-based nonlinear optimization algorithm at the back end (Cheng et al., 2021b). In addition, the initialization method of VINS-Mono is noteworthy: like VI-ORBSLAM (Mur-Artal and Tardós, 2017), it adopts a disjoint scheme that first initializes the pure-vision subsystem and then estimates the gyroscope and accelerometer biases, gravity, scale, and velocity.
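The joint objective that such tightly coupled systems minimize can be written generically as a sum of weighted visual reprojection residuals and IMU residuals (the notation below is a common textbook form, not any single paper's exact formulation):

```latex
\min_{\mathcal{X}} \;
\sum_{i}\sum_{j \in \mathcal{V}(i)}
\rho\!\left(
  \mathbf{r}^{\mathrm{proj}\,\top}_{i,j}\,
  \mathbf{W}_{i,j}\,
  \mathbf{r}^{\mathrm{proj}}_{i,j}
\right)
+
\sum_{k}
\mathbf{r}^{\mathrm{IMU}\,\top}_{k}\,
\boldsymbol{\Sigma}_{k}^{-1}\,
\mathbf{r}^{\mathrm{IMU}}_{k}
```

Here $\mathcal{X}$ collects the states in the optimization window, $\mathbf{r}^{\mathrm{proj}}_{i,j}$ is the reprojection error of landmark $j$ in frame $i$, $\mathbf{r}^{\mathrm{IMU}}_{k}$ is the IMU (pre-integration) error between consecutive keyframes, $\mathbf{W}$ and $\boldsymbol{\Sigma}^{-1}$ are information weights, and $\rho$ is a robust kernel.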
VINS-Mono has been shown to achieve positioning accuracy comparable to OKVIS in tests on the KITTI and EuRoC datasets, with more complete and robust initialization and loop-closure stages. In 2019, the VINS-Mono team proposed VINS-Fusion (Tong et al., 2019), a stereo version that integrates GPS information. As shown in Figure 6(c), the added GPS measurements give it good positioning and mapping results in outdoor environments, and it is considered well suited to autonomous-vehicle applications. In 2020, Campos et al. (2020) proposed ORB-SLAM3, a feature-based, tightly coupled visual-inertial SLAM system. Its main novelties are a more efficient initialization process based on maximum-a-posteriori (MAP) estimation and a multi-map capability that relies on a new place-recognition method with improved recall. In addition, the system can perform visual, visual-inertial, and multi-map SLAM using monocular, stereo, and RGB-D cameras. Experimental results for outdoor scenes are shown in Figure 6(d).
The pipeline of ORB-SLAM3 is similar to that of ORB-SLAM2: the whole system consists of three threads for tracking, local mapping, and loop closure. In addition, ORB-SLAM3 can survive long periods of poor visual information: when tracking is lost, it starts a new map, and when the mapped area is revisited, the new map is seamlessly merged with the previous one. Table 3 summarizes the main visual-inertial SLAM algorithms of recent years. At present, optimization-based VI-SLAM methods have become the mainstream. Beyond the methods above, other state-of-the-art works include, but are not limited to, BASALT, Kimera, ICE-BA, Maplab, and StructVIO.
3.3 Testing and Evaluation
In order to intuitively understand the positioning performance of the above SLAM methods, some typical algorithms were tested on the same onboard computer, equipped with an Intel Core i7-9700 CPU, 16 GB RAM, and Ubuntu 18.04 with ROS Melodic, and compared with one of our previous works (Cheng et al., 2021a). As described in Cheng et al. (2021a), an improved trust-region iteration strategy was proposed based on the traditional Gauss-Newton (GN) linear iteration strategy, and this strategy was integrated into the VI-ORBSLAM framework (Mur-Artal and Tardós, 2017) to achieve faster initialization and higher positioning accuracy. The model of the trust-region iteration strategy is shown in Figure 7. It combines the steepest-descent algorithm and the GN algorithm, approximating the objective function with a model that is trusted only near the current point; a constrained minimization subproblem is solved at each iteration step.
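A trust-region step that blends steepest descent and Gauss-Newton in this way is commonly realized as Powell's dogleg; the sketch below is a generic illustration under that assumption, not the paper's exact strategy:

```python
import numpy as np

# Powell-dogleg trust-region step for nonlinear least squares:
# min 0.5 * ||r + J p||^2 subject to ||p|| <= delta.
# Generic illustration of combining steepest descent with Gauss-Newton
# inside a trust region.

def dogleg_step(J, r, delta):
    g = J.T @ r                                      # gradient of the model cost
    p_gn = -np.linalg.lstsq(J, r, rcond=None)[0]     # full Gauss-Newton step
    if np.linalg.norm(p_gn) <= delta:
        return p_gn                                  # GN step fits: take it
    Jg = J @ g
    p_sd = -(g @ g) / (Jg @ Jg) * g                  # Cauchy (steepest-descent) point
    if np.linalg.norm(p_sd) >= delta:
        return delta * p_sd / np.linalg.norm(p_sd)   # clip along -gradient
    # Walk from the Cauchy point toward the GN step until the region boundary.
    d = p_gn - p_sd
    a, b, c = d @ d, 2 * p_sd @ d, p_sd @ p_sd - delta**2
    t = (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)  # boundary intersection
    return p_sd + t * d

# Usage on a toy linear least-squares problem, for several region radii.
J = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
r = np.array([1.0, 1.0, 1.0])
steps = {delta: dogleg_step(J, r, delta) for delta in (0.05, 0.5, 0.8, 10.0)}
```

In a full solver the radius `delta` is then grown or shrunk depending on how well the model's predicted cost reduction matches the actual reduction.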
The initial parameters that need to be estimated include the scale factor, velocity, gravity, and the biases of the accelerometer and gyroscope. In order to make all variables observable, the pure ORB-SLAM system must run for several seconds first. The specific steps of this method are as follows. First, a visual initialization process is performed, including ORB extraction, map initialization, and initial pose estimation. Second, the IMU and camera measurements are frequency-aligned using IMU pre-integration, and keyframes are generated. Third, an improved trust-region-based iterative strategy is applied to gyroscope bias estimation, and the gravity direction is refined. Finally, the accelerometer bias and visual scale are estimated based on the previous estimates. The pipeline of this previous work is shown in Figure 8.
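As one concrete illustration of the final step, the visual scale can be recovered in closed form by least squares once metric (IMU-derived) and up-to-scale (visual) positions are available; the trajectories below are synthetic, and real systems solve scale jointly with gravity and biases:

```python
import numpy as np

# Closed-form least-squares estimate of the monocular visual scale s, given
# up-to-scale visual positions and metric positions (e.g. from IMU integration).

def estimate_scale(p_visual, p_metric):
    """s = argmin_s sum ||s * p_visual - p_metric||^2 (closed form)."""
    return np.sum(p_visual * p_metric) / np.sum(p_visual * p_visual)

rng = np.random.default_rng(0)
p_v = rng.normal(size=(50, 3))                      # up-to-scale visual positions
p_m = 2.5 * p_v + 0.01 * rng.normal(size=(50, 3))   # metric positions, slight noise
s = estimate_scale(p_v, p_m)
print(s)  # close to the true scale of 2.5
```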
The 2D trajectories of the algorithms on the EuRoC dataset V2_01_easy sequence are shown in Figure 9. It can be seen that the test results of each algorithm deviate from the ground truth (GT) to different degrees. The trajectory of the paper's algorithm (red line) is closest to GT (black dashed line), while VI-ORBSLAM (blue line) drifts the most. The position-change curves in the X, Y, and Z directions are shown in Figure 10, and the comparison curves of the Euler angles (roll, pitch, and yaw) are shown in Figure 11. Table 4 shows the quantitative root-mean-square error (RMSE) results and frame rates on the same CPU platform (i7-9700) across all 11 sequences. Because all algorithms use multi-threading, the third column of Table 4 reports the frame rate when processing the image stream. Figures 12 and 13 provide the RMSE and cumulative distribution function (CDF) of the translation error, respectively, and Figures 14 and 15 provide the same for the orientation error.
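The RMSE reported in such evaluations is typically the absolute trajectory error (ATE); a simplified version can be sketched as below (mean-offset alignment only, whereas standard evaluation tools also align rotation and scale, e.g. via the Umeyama method):

```python
import numpy as np

# Simplified absolute trajectory error (ATE) RMSE: align estimated and
# ground-truth trajectories by removing their mean offsets, then take the
# RMS of the per-pose position errors.

def ate_rmse(est, gt):
    est = est - est.mean(axis=0)               # remove translational offset
    gt = gt - gt.mean(axis=0)
    err = np.linalg.norm(est - gt, axis=1)     # per-pose position error
    return np.sqrt(np.mean(err ** 2))

t = np.linspace(0.0, 1.0, 100)
gt = np.stack([t, t ** 2, np.zeros_like(t)], axis=1)   # synthetic ground truth
est = gt + 0.02 * np.sin(10 * t)[:, None]              # synthetic drifting estimate
print(ate_rmse(est, gt))
```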
It can be seen that the paper's previous work, a fast monocular visual-inertial system with an improved iterative initialization strategy, achieved the best positioning accuracy on almost all sequences. In fact, thanks to the improved initialization process, the method provides the best orientation performance on six to seven of the eleven sequences, and the system can quickly resume operation even when ORB features cannot be extracted.
3.4 Vision-LIDAR SLAM
Vision and LiDAR have their own advantages. Vision can obtain a large amount of texture information from the environment and has strong scene-recognition capabilities, while LiDAR does not rely on ambient light, has good reliability, and offers higher distance-measurement accuracy. Therefore, in the field of autonomous driving, SLAM systems that fuse vision and LiDAR can provide a smarter and more reliable environment-perception and state-estimation solution. Such a system follows the classic SLAM architecture with three main steps: (i) data processing; (ii) estimation; and (iii) global mapping. According to the relative weight of vision and LiDAR in the system, vision-LiDAR SLAM schemes can be divided into three categories: vision-guided methods, LiDAR-guided methods, and vision-LiDAR mutual-correction methods.
3.4.1 Vision-guided methods
Visual SLAM, especially monocular visual SLAM, cannot effectively recover the depth of feature points, while LiDAR excels at exactly this. To make up for this shortcoming, researchers have tried to integrate LiDAR data into visual SLAM systems. The representative work of vision-guided SLAM is LIMO (Graeter et al., 2018). This method projects the spatial point cloud obtained by the LiDAR onto the image plane to estimate the scale of the visual features, and then constructs an error term between the LiDAR-recovered feature scale and the feature scale estimated from the camera pose as a constraint for back-end optimization.
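The projection-and-association idea used by LIMO can be sketched as follows; the camera intrinsics `K`, extrinsics `T_cam_lidar`, and pixel search radius are illustrative assumptions, not LIMO's actual parameters:

```python
import numpy as np

# Sketch of LiDAR-aided feature depth: project LiDAR points into the image
# and assign each visual feature the depth of the nearest projected point.

def project_points(pts_lidar, T_cam_lidar, K):
    """Return pixel coordinates and depths of LiDAR points in front of the camera."""
    pts_h = np.hstack([pts_lidar, np.ones((len(pts_lidar), 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]      # LiDAR frame -> camera frame
    pts_cam = pts_cam[pts_cam[:, 2] > 0]            # keep points in front
    uvw = (K @ pts_cam.T).T                         # pinhole projection
    return uvw[:, :2] / uvw[:, 2:3], pts_cam[:, 2]

def depth_for_feature(feat_uv, uv, depths, radius=3.0):
    """Depth of the projected LiDAR point nearest to the feature, if close enough."""
    d = np.linalg.norm(uv - feat_uv, axis=1)
    i = np.argmin(d)
    return depths[i] if d[i] < radius else None

K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
T = np.eye(4)                                       # assume aligned sensor frames
pts = np.array([[0.0, 0.0, 5.0], [1.0, 0.0, 10.0], [0.0, 0.0, -3.0]])
uv, depths = project_points(pts, T, K)
print(depth_for_feature(np.array([321.0, 240.0]), uv, depths))  # -> 5.0
```

LIMO additionally fits local planes to the neighboring LiDAR points and handles foreground/background separation; this sketch keeps only the nearest-neighbor core.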
Shin et al. (2018b) proposed a method that uses LiDAR to obtain sparse depth point clouds for visual SLAM. Since camera resolution is much higher than LiDAR resolution, this method leaves a large number of pixels without depth information. To solve this problem, De Silva et al. (2018) used a Gaussian regression model to interpolate the missing depth values after calibrating the geometric transformation between the two sensors. This method uses LiDAR to directly initialize the features detected in the image, achieving an effect similar to that of an RGB-D sensor. Other studies integrate LiDAR into visual SLAM to improve the practical value of the solution, for example by reducing cost, improving performance, and enhancing system robustness.