Robot automatic recognition and grasping system based on vision and ultrasonic technology


Visual sensors directly capture an object's external appearance, but a single camera yields only a two-dimensional image. Although stereo vision can provide three-dimensional information, it has difficulty distinguishing objects that share the same outline and differ only in depth (objects with holes, stepped objects, etc.), and it places certain demands on ambient lighting. Ultrasonic sensors, by contrast, are insensitive to light and object material, are structurally simple, and directly measure the distance from the sensor to the point under test. This paper therefore combines vision with ultrasonic measurement: the two-dimensional image is fused with the depth information obtained by the ultrasonic sensor to automatically identify and spatially locate the workpiece to be assembled, and to determine the spatial position and posture of the manipulator's end effector so that it can accurately grasp the workpiece.

1 System Principle and Structure

The system consists of a manipulator, a CCD visual sensor, an ultrasonic sensor, and the corresponding signal-processing units. The CCD is mounted on the manipulator's end effector, forming a hand-eye vision configuration; the transmitting and receiving probes of the ultrasonic sensor are also fixed on the end effector. The CCD acquires a two-dimensional image of the object to be identified and grasped and guides the ultrasonic sensor in acquiring depth information. The system structure is shown in Figure 1.


Image processing produces an accurate description of the object's shape and comprises the following steps:

a. Image edge extraction;

b. Contour tracking;

c. Feature point extraction;

d. Curve segmentation and segment matching;

e. Shape description and recognition.

After the object's edges have been extracted, contour tracking thins the edges and removes pseudo edge points and noise points; the edge points forming the closed curve are then Freeman chain-coded, and the direction of each chain code and the XY coordinates of each point on the curve are recorded for further analysis of the object's geometric characteristics. This study improves the search direction and search order of edge points in the traditional contour-tracking algorithm and eliminates redundant points as the search proceeds, which reduces the data volume and computation time and gives good noise suppression and smoothing. For feature point extraction, polygonal approximation is combined with curvature calculation, overcoming both the tendency of polygonal approximation to produce pseudo feature points and the heavy computational cost of curvature calculation. Once the image acquired by the CCD has been processed, features of the object can be extracted, such as the centroid coordinates, area, curvature, edges, corner points, and short-axis direction; from these a basic description of the object's shape is obtained. On this basis, visual information guides the ultrasonic sensor either to measure the depth at a test point, yielding the object's depth (height), or to move along the test surface of the workpiece while continuously collecting distance readings, producing a scanned distance curve from which the edge or shape of the workpiece is analyzed [1]. The computer then fuses the visual and depth information, performs image matching and recognition, and controls the robot to grasp the object accurately in a suitable posture.
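As a small illustration of the chain-coding step, the sketch below computes the 8-direction Freeman code of a closed contour. The `contour` input (an ordered list of boundary pixels from contour tracking) and the direction numbering are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of 8-direction Freeman chain coding for a closed contour.
# `contour` is assumed to be an ordered list of (x, y) pixels produced by a
# contour-tracking step; the numbering (0 = +x, counter-clockwise) is one
# common convention, not necessarily the one used in the paper.
FREEMAN = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}

def freeman_chain_code(contour):
    """Return the chain code of a closed, 8-connected contour."""
    codes = []
    n = len(contour)
    for i in range(n):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % n]   # wrap around: the curve is closed
        codes.append(FREEMAN[(x1 - x0, y1 - y0)])
    return codes

# A 2x2-pixel square traversed counter-clockwise codes as [0, 2, 4, 6].
print(freeman_chain_code([(0, 0), (1, 0), (1, 1), (0, 1)]))
```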

2 Workpiece image processing

2.1 Extraction of workpiece image edges

Complex workpieces often appear in the image at more than one gray level, so meaningful edges cannot be extracted with a single gray threshold.

Using a multi-threshold method would inevitably increase the computation time and the complexity of the image processing. For the between-class variance automatic thresholding method, adding thresholds not only complicates data processing; once more than two thresholds are used, the reliability of the algorithm also suffers. For these reasons, edges are extracted directly from the grayscale image. Image edges generally occur where the gray-level function is discontinuous and can be obtained from its first or second derivatives. Classical first-derivative edge operators include the Roberts and Sobel operators; second-derivative methods include the Laplacian and Marr-Hildreth operators. Analysis and comparison of these algorithms shows that the Sobel operator is not only easy to implement and fast, but also provides the most accurate estimate of edge direction. The Sobel operator consists of two 3 × 3 operators oriented 90° apart; convolving them with the image yields the image's edges and their directions. For a digital image {f(i, j)}, the Sobel operator can be expressed as:

Gx(i, j) = f(i-1, j-1) + 2f(i-1, j) + f(i-1, j+1) - f(i+1, j-1) - 2f(i+1, j) - f(i+1, j+1);

Gy(i, j) = f(i-1, j-1) + 2f(i, j-1) + f(i+1, j-1) - f(i-1, j+1) - 2f(i, j+1) - f(i+1, j+1).

After the gradient magnitude G = |Gx| + |Gy| has been computed, a magnitude threshold can be applied to reduce the number of extracted edges, keeping only those where the gray level changes sharply. The edges are then thinned by exploiting the fact that edge points have the locally maximal gradient magnitude. After Sobel edge extraction, the corner points of the image must also be extracted [2] so that size information of the workpiece surface, such as side lengths, can be calculated.
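The following numpy/scipy sketch shows one way to implement this step; the kernels follow the Gx and Gy expressions above (rows indexed by i, columns by j), and the threshold is a free parameter to be tuned, not a value given in the paper.

```python
# Sobel edge extraction with magnitude thresholding, G = |Gx| + |Gy|.
import numpy as np
from scipy.ndimage import convolve

def sobel_edges(image, threshold):
    """image: 2-D grayscale array; returns a boolean edge map."""
    # Correlation weights matching Gx above; convolve() flips the kernel,
    # which only negates the response and is harmless under abs().
    kx = np.array([[ 1,  2,  1],
                   [ 0,  0,  0],
                   [-1, -2, -1]], dtype=float)
    ky = kx.T                     # the same operator rotated 90 degrees
    f = image.astype(float)
    g = np.abs(convolve(f, kx)) + np.abs(convolve(f, ky))
    return g > threshold          # keep only edges with large gray-level change
```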

2.2 Determination of centroid coordinates

The centroid of an object in an image can be obtained in two ways: by region processing and moment calculation, or by integration along the edge chain code. The moment method is conceptually simple and applicable to arbitrary shapes, but it must be combined with an algorithm that assigns pixels to regions.
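As a brief illustration of the region-moment route, the sketch below computes the centroid of a binary object mask as (m10/m00, m01/m00); the mask input stands in for the result of the region-division step and is an assumption of this example.

```python
# Centroid from raw image moments: m_pq = sum over object pixels of x^p y^q.
import numpy as np

def centroid(mask):
    """mask: 2-D boolean array marking the object's pixels (rows = y, cols = x)."""
    ys, xs = np.nonzero(mask)               # coordinates of object pixels
    m00 = len(xs)                           # zeroth moment = area in pixels
    return xs.sum() / m00, ys.sum() / m00   # (x̄, ȳ) = (m10/m00, m01/m00)
```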

2.3 Determination of axial direction

For the robot to grasp the object accurately and in the correct posture, the object's axial direction must be determined precisely. Geometrically, the long axis of an object is defined as the straight line through the object's centroid about which the object's second-order moment is minimal. Let θ be the angle between the object's long axis in the image and the positive X direction of the image plane, with |θ| ≤ π/2. The second-order moment of the object about this axis is then

I(θ) = Σ [(x - x̄) sin θ - (y - ȳ) cos θ]²,

where (x̄, ȳ) is the centroid and the sum runs over all pixels (x, y) of the object region; setting dI/dθ = 0 gives the minimizing angle via tan 2θ = 2μ11 / (μ20 - μ02), where μpq are the second-order central moments of the region.
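A compact sketch of this moment-based estimate, using arctan2 to resolve the quadrant of 2θ (the mask input is again an assumed binary object region):

```python
# Long-axis angle from second-order central moments:
# theta = 0.5 * atan2(2*mu11, mu20 - mu02), with |theta| <= pi/2.
import numpy as np

def long_axis_angle(mask):
    """Angle between the object's long axis and the +X image axis, in radians."""
    ys, xs = np.nonzero(mask)
    dx, dy = xs - xs.mean(), ys - ys.mean()      # offsets from the centroid
    mu20 = (dx * dx).sum()
    mu02 = (dy * dy).sum()
    mu11 = (dx * dy).sum()
    return 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)
```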

Clearly, the moment-based method of determining the axial direction computes over the entire object region and must first establish which region each pixel belongs to, so the amount of computation is large. Figure 2(a) shows the workpiece axis determined by this algorithm. For objects with simple shapes, the following simplified axial estimation algorithm can be used:


a. Determine the centroid coordinates of the object;

b. On the upper half of the object's closed edge contour, find the point closest to the centroid, estimate the tangent direction at that point by least squares, and let α1 be the angle between this tangent and the positive X direction of the image plane;

c. Determine the corresponding tangent direction α2 on the lower half of the curve in the same way;

d. The axial direction of the object can then be roughly estimated as θ = (α1 + α2)/2.

Figure 2(b) shows the workpiece axis obtained with this simplified algorithm. Because it processes only the object's edge contour points, the computation time is greatly reduced. A sketch of steps a-d is given below.
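This sketch assumes the contour is an ordered closed curve whose first and second halves correspond to the upper and lower halves of the boundary, and it fits the tangent as the dominant direction of a small window of points (a least-squares direction fit that, unlike a y = kx + b regression, also handles near-vertical tangents); all names are illustrative.

```python
# Simplified axis estimate: theta = (alpha1 + alpha2) / 2 from the tangents
# at the two contour points nearest the centroid (steps a-d above).
import numpy as np

def tangent_angle(points):
    """Least-squares tangent direction of a short run of contour points."""
    pts = np.asarray(points, dtype=float)
    pts -= pts.mean(axis=0)                       # center the window
    _, _, vt = np.linalg.svd(pts)                 # dominant direction = tangent
    dx, dy = vt[0]
    ang = np.arctan2(dy, dx)
    return (ang + np.pi / 2) % np.pi - np.pi / 2  # fold into (-pi/2, pi/2]

def simple_axis(contour, cx, cy, window=3):
    """contour: ordered (x, y) points of the closed edge; (cx, cy): centroid."""
    pts = np.asarray(contour, dtype=float)
    half = len(pts) // 2
    angles = []
    for seg in (pts[:half], pts[half:]):          # upper / lower halves (steps b, c)
        d = np.hypot(seg[:, 0] - cx, seg[:, 1] - cy)
        i = int(d.argmin())                       # point nearest the centroid
        lo, hi = max(0, i - window), min(len(seg), i + window + 1)
        angles.append(tangent_angle(seg[lo:hi]))
    return (angles[0] + angles[1]) / 2.0          # step d
```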

3 Ultrasonic depth detection

Since the image acquired by the CCD camera carries no depth information, visual information alone cannot correctly distinguish workpieces whose two-dimensional outlines are identical and whose heights differ only slightly. This paper uses an ultrasonic ranging sensor to compensate for this deficiency. After the workpiece's edges, centroid, and other features have been obtained by image processing, the robot is guided to the point to be measured and the depth of the workpiece is measured; the visual signal is then fused with the ultrasonic signal to obtain more complete workpiece information. The ultrasonic sensor mounted on the robot's end effector consists of transmitting and receiving probes. Based on the principle of sound-wave reflection, the signal reflected from the point under test is detected and processed to yield the depth of the workpiece. To improve detection accuracy, the receiving circuit employs variable-threshold detection, peak detection, temperature compensation, and phase compensation [1]. With the fusion of image and depth information proposed in this paper, two cylindrical workpieces in the field of view with exactly the same outline and a height difference of 0.1 mm can be correctly identified and grasped.
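As a minimal illustration of the underlying time-of-flight calculation, the sketch below converts a round-trip echo time into depth; the first-order temperature model c ≈ 331.4 + 0.6·T m/s is the usual compensation formula and an assumption of this example, not a value taken from the paper.

```python
# Ultrasonic time-of-flight ranging with simple temperature compensation.

def ultrasonic_depth(echo_time_s, temperature_c=20.0):
    """Distance (m) from probe to surface, from the round-trip echo time."""
    c = 331.4 + 0.6 * temperature_c    # speed of sound, first-order in temperature
    return c * echo_time_s / 2.0       # the pulse travels out and back

# Example: a 1.75 ms round trip at 25 degrees C is roughly 0.303 m.
print(ultrasonic_depth(1.75e-3, 25.0))
```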

4 Experimental results and conclusions

Based on the methods above, object recognition and grasping experiments were completed on a MOVEMASTER-EX robot assembly platform. Under natural light and ordinary illumination, 3 to 5 typical workpieces of different shapes and sizes, placed arbitrarily within the field of view of the platform, were automatically recognized and grasped. The recognition time was less than 5 s (including the manipulator's motion during recognition, positioning, and grasping), the positioning error was less than ±2 mm, and the system showed good versatility and portability. Figures 3(a) to (d) show the recognition process for a workpiece to be grasped.

The experimental results show that the proposed detection scheme, which combines robot hand-eye vision with ultrasonic ranging and identifies and grasps workpieces by fusing two-dimensional image information with depth information, can accurately identify and locate objects. The algorithms are simple, the computational load is small, and real-time performance and reliability are good. The system can supply the robot with information about object shape, category, and size for interaction with its environment, allowing robotic assembly to adapt to a variety of complex environments and processes, and it shows good application prospects for automating and adding flexibility and intelligence to industrial production.


