The three major binocular stereo vision algorithm principles and their code implementation

Publisher: tau29 | Last updated: 2024-08-16 | Source: elecfans

Binocular stereo vision has long been both a hot spot and a difficulty in machine vision research. It is "hot" because it has very broad application prospects: as optics, computer science, and related disciplines continue to develop, binocular stereo technology will keep improving and find its way into many aspects of daily life. It is "difficult" because it is constrained by hardware such as cameras and lenses, as well as by the limitations of the associated algorithms. How to advance binocular stereo vision research and apply it effectively in real production remains an open challenge.


1. Introduction

Binocular stereo vision is an important branch of machine vision. Since its emergence in the mid-1960s, it has developed over several decades and is now widely used in robot vision, aerial mapping, military and medical imaging, and industrial inspection. Binocular stereo vision is based on the parallax principle: imaging devices capture left and right images of the measured object from different positions, the positional offset of each spatial point between the two 2-D images is computed according to the triangulation principle, and this offset is then used for 3-D reconstruction to recover the object's three-dimensional geometry. (This article does not cover the mathematical derivations in detail.)

2. Principles and code implementation of the three basic algorithms of binocular stereo vision (based on OpenCV)

Commonly used matching costs in binocular stereo vision include the sum of absolute differences (SAD) and the sum of squared differences (SSD) of corresponding pixels, both region-based local matching criteria, as well as the semi-global matching algorithm (SGM).

2.1 SAD (sum of absolute differences) principle

The basic idea of the SAD matching algorithm is to sum the absolute values of the pixel-wise differences between corresponding pixel blocks of the row-aligned left and right views.

The mathematical formula is as follows:

SAD(x, y, d) = Σ_{(i, j) ∈ W} | I_L(x + i, y + j) − I_R(x + i − d, y + j) |

where W is the matching window, d is the candidate disparity, and I_L, I_R are the left and right images.

The basic process of the SAD matching algorithm is as follows:
① Input a left view (Left-Image) and a right view (Right-Image) that have been rectified so their rows are aligned.
② Scan the left view Left-Image, select an anchor point, and construct a small window similar to a convolution kernel.
③ Cover Left-Image with this small window and collect all the pixels inside the covered area.
④ Cover Right-Image with the same small window and collect all the pixels inside the covered area.
⑤ Subtract the Right-Image window pixels from the Left-Image window pixels, and compute the sum of the absolute values of all the differences.
⑥ Shift the small window over Right-Image and repeat steps ④–⑤ (a search range is set here; the loop exits once the range is exceeded).
⑦ The window with the smallest SAD value within the range marks the pixel block that best matches the Left-Image anchor point.

2.1.1 SAD (sum of absolute differences) C++ code implementation based on OpenCV

First, define a header file for the SAD algorithm (SAD_Algorithm.h):

#include <iostream>
#include <iomanip>
#include <opencv2/opencv.hpp>
using namespace std;
using namespace cv;

class SAD
{
public:
    SAD() : winSize(7), DSR(30) {}
    SAD(int _winSize, int _DSR) : winSize(_winSize), DSR(_DSR) {}
    Mat computerSAD(Mat &L, Mat &R); // compute the SAD disparity map

private:
    int winSize; // convolution kernel (window) size
    int DSR;     // disparity search range
};

Mat SAD::computerSAD(Mat &L, Mat &R)
{
    int Height = L.rows;
    int Width  = L.cols;

    Mat Kernel_L(Size(winSize, winSize), CV_8U, Scalar::all(0));
    Mat Kernel_R(Size(winSize, winSize), CV_8U, Scalar::all(0));
    Mat Disparity(Height, Width, CV_8U, Scalar(0)); // disparity map

    for (int i = 0; i < Width - winSize; i++) // scan the left image column by column
    {
        for (int j = 0; j < Height - winSize; j++)
        {
            Kernel_L = L(Rect(i, j, winSize, winSize));
            Mat MM(1, DSR, CV_32F, Scalar(0));
            for (int k = 0; k < DSR; k++) // slide over the disparity search range
            {
                int x = i - k;
                if (x >= 0)
                {
                    Kernel_R = R(Rect(x, j, winSize, winSize));
                    Mat Dif;
                    absdiff(Kernel_L, Kernel_R, Dif); // per-pixel absolute differences
                    Scalar ADD = sum(Dif);            // sum over the window
                    MM.at<float>(k) = float(ADD[0]);
                }
            }
            Point minLoc;
            minMaxLoc(MM, NULL, NULL, &minLoc, NULL);
            int loc = minLoc.x; // disparity with the smallest SAD cost
            Disparity.at<uchar>(j, i) = loc * 16; // scaled for display
        }
        double rate = double(i) / Width;
        cout << "Completed " << setprecision(2) << rate * 100 << "%" << endl; // show processing progress
    }
    return Disparity;
}

Calling example:

#include "SAD_Algorithm.h"

int main(int argc, char* argv[])
{
    Mat Img_L = imread("Teddy_L.png", 0); // the images used here are in the project folder
    Mat Img_R = imread("Teddy_R.png", 0);
    Mat Disparity; // disparity map

    SAD mySAD(7, 30); // SAD parameters: window size and disparity search range

    Disparity = mySAD.computerSAD(Img_L, Img_R);
    imshow("Teddy_L", Img_L);
    imshow("Teddy_R", Img_R);
    imshow("Disparity", Disparity); // show the disparity map

    waitKey(); // press any key to exit
    return 0;
}

2.1.2 Operational Effect of SAD Algorithm

[Figure: disparity map produced by the SAD algorithm on the Teddy image pair]

As can be seen, although the SAD algorithm runs fast, the quality of the resulting disparity map is poor.

2.2 SSD (sum of squared differences) principle

The SSD (sum of squared differences) algorithm is roughly similar to the SAD (sum of absolute differences). Its mathematical formula is as follows:

SSD(x, y, d) = Σ_{(i, j) ∈ W} ( I_L(x + i, y + j) − I_R(x + i − d, y + j) )²

where W is the matching window, d is the candidate disparity, and I_L, I_R are the left and right images.

Because the process and code implementation of the SSD matching algorithm are similar to those of SAD, and for reasons of length, they are not repeated in this article; readers can implement the algorithm themselves.

2.3 Principle of SGBM (semi-global block matching)

SGM (semi-global matching) is a semi-global algorithm for computing disparity in binocular stereo vision; its implementation in OpenCV is SGBM (semi-global block matching). SGBM defines a global energy function over the disparity map (formed by the disparity of every pixel) and seeks the disparity map that minimizes this energy. Original paper: Heiko Hirschmuller, "Stereo processing by semiglobal matching and mutual information," IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2):328–341, 2008. The energy function is as follows:

E(D) = Σ_p { C(p, D_p) + Σ_{q ∈ N_p} P1 · I[ |D_p − D_q| = 1 ] + Σ_{q ∈ N_p} P2 · I[ |D_p − D_q| > 1 ] }

where:
- D — the disparity map;
- p, q — pixels in the image;
- N_p — the neighborhood of pixel p (generally taken as 8-connected);
- C(p, D_p) — the matching cost of pixel p when its disparity is D_p;
- P1, P2 — penalty coefficients, applied when the disparity of a neighbor of p differs from that of p by exactly 1 and by more than 1, respectively;
- I[·] — returns 1 when the expression in brackets is true, otherwise 0.

The basic process of the SGBM algorithm is as follows:
① Preprocessing: process the source image with the Sobel operator, map the result to a new image, and obtain the gradient information of the image for the subsequent cost calculation.
② Cost calculation: compute the sampling-based gradient cost from the preprocessed gradient information, and the sampling-based SAD cost from the source image.
③ Dynamic programming: four aggregation paths are set by default, along with the path penalties P1 and P2 (whose settings involve P1, P2, cn — the number of image channels — and SADWindowSize, the SAD window size).
④ Post-processing: uniqueness check, sub-pixel interpolation, left-right consistency check, and connected-component (speckle) detection.

2.3.1 SGBM (semi-global block matching) C++ code implementation based on OpenCV

First, define a header file for the SGBM algorithm (SGBM_Algorithm.h). The parameters are explained in the code and its comments (readers who need to optimize can tune them), so they are not repeated here.

enum { STEREO_BM = 0, STEREO_SGBM = 1, STEREO_HH = 2, STEREO_VAR = 3, STEREO_3WAY = 4 };

#include <iostream>
#include <opencv2/opencv.hpp>
using namespace std;
using namespace cv;

void calDispWithSGBM(Mat Img_L, Mat Img_R, Mat &imgDisparity8U)
{
    Size imgSize = Img_L.size();
    int numberOfDisparities = ((imgSize.width / 8) + 15) & -16;
    Ptr<StereoSGBM> sgbm = StereoSGBM::create(0, 16, 3);

    int cn = Img_L.channels(); // number of channels of the left image
    int SADWindowSize = 9;
    int sgbmWinSize = SADWindowSize > 0 ? SADWindowSize : 3;

    sgbm->setMinDisparity(0); // minDisparity: minimum disparity, defaults to 0

    sgbm->setNumDisparities(numberOfDisparities); // numDisparities: disparity search range;
                                                  // must be an integer multiple of 16

    // Generally recommended values for the penalty coefficients. P1 and P2
    // control the smoothness of the disparity map: the larger P2, the smoother.
    sgbm->setP1(8 * cn * sgbmWinSize * sgbmWinSize);
    sgbm->setP2(32 * cn * sgbmWinSize * sgbmWinSize);

    sgbm->setDisp12MaxDiff(1); // maximum allowed error in the left-right consistency check

    sgbm->setPreFilterCap(31); // truncation value of the prefilter: the preprocessing
                               // output is clamped to [-preFilterCap, preFilterCap];
                               // parameter range: 1-31

    sgbm->setUniquenessRatio(10); // disparity uniqueness percentage: the lowest cost in the
                                  // disparity window must win over the second-lowest cost by
                                  // the factor (1 + uniquenessRatio/100) for its disparity to
                                  // be accepted, otherwise the pixel's disparity is set to 0;
                                  // must be non-negative, usually 5-15

    sgbm->setSpeckleWindowSize(100); // size threshold of disparity connected components: if the
                                     // connected component of a disparity point contains fewer
                                     // than speckleWindowSize pixels, its disparity is treated
                                     // as an invalid noise point

    // The remaining settings and the disparity computation below follow the
    // standard OpenCV SGBM usage (the original listing is cut off here).
    sgbm->setSpeckleRange(32); // maximum disparity variation within a connected component
    sgbm->setBlockSize(sgbmWinSize);
    sgbm->setMode(StereoSGBM::MODE_SGBM);

    Mat imgDisparity16S(Img_L.rows, Img_L.cols, CV_16S);
    sgbm->compute(Img_L, Img_R, imgDisparity16S); // 16-bit fixed-point disparity
    imgDisparity16S.convertTo(imgDisparity8U, CV_8U, 255.0 / (numberOfDisparities * 16.0)); // scale to 8 bits for display
}

