With the development of computer science and automatic control technology, more and more types of intelligent robots are appearing in factories and daily life. As an important subsystem of an intelligent robot, the robot vision system is receiving increasing attention; it involves fields such as image processing, pattern recognition, and visual tracking. Different types of robots have different work focuses, so their vision systems differ subtly in software or hardware. This paper studies a monocular vision system for service robots. It processes two-dimensional images, recognizing the color and shape of unobstructed objects and performing translational tracking of 3D target objects.
The vision system is very complex: it must capture images accurately, respond to external changes in real time, and track moving targets in real time, so it places high demands on both hardware and software. The vision system of the currently popular soccer robot is a typical example of rapid recognition and response. Generally, it uses color-mark calibration to identify players and targets, and extends the prediction function of the Kalman filter to achieve target tracking. In hardware, an off-the-shelf camera implements the robot's image acquisition system.
In this design, a CMOS image sensor is used instead of a CCD to collect images, and the DSP chip TMS320VC5509A performs image processing and control. To show the recognition and tracking results of the vision system intuitively, a TFT-format LCD is used for display. In software, part of the soccer robot's vision technology achieves rapid recognition of the target, and a Jacobian matrix constructed from global moment features achieves adaptive tracking of the target.
1 Hardware design
Figure 1 is a functional module block diagram of the system hardware circuit.
1.1 Image acquisition
The lens images the external scene onto the area array of the image sensor. Currently, two image sensor types are popular: the area-array CCD (Charge-Coupled Device) and the area-array CMOS sensor. Compared with CCD image sensors, the active pixel unit of a CMOS image sensor provides an amplifier for each pixel, requires only a single low-voltage logic supply, and consumes only about one tenth the power of a CCD. The CMOS image sensor also integrates the A/D conversion stage and outputs digital signals directly. Based on these factors, this system uses the CMOS color image sensor OV7635 from OmniVision.
The resolution of the OV7635 is 640×480, and it can output 8-bit data in three formats: YCbCr 4:2:2, RGB 4:2:2, and RGB raw data. Its maximum VGA-format output reaches 30 fps (frames per second), and it supports both progressive and interlaced scanning. The OV7635 has two working modes: master and slave. In master mode, the synchronization signals and clock are not controlled by peripheral devices; in slave mode, the field synchronization signal VSYNC, the line synchronization signal HREF, and the crystal oscillator frequency XCLK are all controlled by external devices. This system adopts the master mode. The OV7635's on-chip registers are configured through the I2C bus to make it output raw data: after the system is powered on and reset, the sensor's registers are initialized by the I2C bus signals of the DSP chip, and the OV7635 then outputs the image signals as required, including the line synchronization signal HREF, the field synchronization signal VSYNC, the pixel clock signal PCLK, and the digital image data.
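The register-initialization step can be sketched as follows. The register addresses, values, and the `i2c_write_reg` call are illustrative placeholders (the real register map comes from the OV7635 datasheet), and a recording stub stands in for the DSP's I2C peripheral so the sequence can be exercised off-target:

```c
#include <stdint.h>

/* Hypothetical register/value pairs -- the real OV7635 register map
 * must be taken from the OmniVision datasheet. */
typedef struct { uint8_t reg; uint8_t val; } ov_reg_t;

static const ov_reg_t ov7635_init[] = {
    { 0x12, 0x80 },  /* soft reset (assumed address/value) */
    { 0x12, 0x24 },  /* raw-data output, VGA (assumed)     */
    { 0x11, 0x00 },  /* no pixel-clock prescale (assumed)  */
};

/* Stub standing in for the TMS320VC5509A I2C driver; it records the
 * last write so the sequence can be checked on a host machine. */
static uint8_t last_reg, last_val;
static int i2c_write_reg(uint8_t dev, uint8_t reg, uint8_t val)
{
    (void)dev;
    last_reg = reg;
    last_val = val;
    return 0;            /* 0 = write acknowledged */
}

#define OV7635_I2C_ADDR 0x21u  /* 7-bit device address, assumed */

/* Walk the table, aborting on the first write that is not acked. */
int ov7635_configure(void)
{
    unsigned i;
    for (i = 0; i < sizeof ov7635_init / sizeof ov7635_init[0]; i++)
        if (i2c_write_reg(OV7635_I2C_ADDR, ov7635_init[i].reg,
                          ov7635_init[i].val) != 0)
            return -1;
    return 0;
}
```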
1.2 LCD Display
So that an observer can intuitively see the recognition and tracking results of the vision system, an INNOLUX PT035TN01 LCD screen is used. To avoid increasing the burden on the DSP while still showing the tracking of external targets in real time, the LCD is driven not through the DSP but directly from the image data output by the OV7635 under CPLD timing control. The PT035TN01 is a 3.5-inch TFT LCD with a resolution of 320 × 3 (RGB) × 240. Its two input control pins IF1 and IF2 select the input data format: serial RGB, CCIR601, or CCIR656. The LCD has four scanning modes. This vision system uses the CCIR601 input format with top-to-bottom, left-to-right scanning.
In the CCIR601 format, the pixel clock PCLK output by the image sensor is divided by two by the CPLD as the working clock of the LCD, and the line synchronization signal HREF output by the image sensor is processed by the CPLD as the line synchronization signal HIS of the LCD. In this way, under the control of the CPLD, the data signal output by the image sensor OV7635 is sent to the LCD for display.
1.3 Timing Control
The field synchronization signal VSYNC, line synchronization signal HREF and pixel clock signal PCLK output by OV7635 are connected to the CPLD chip, which generates control signals to store the data signal output by OV7635 into the FIFO frame memory AL422B, and generates the clock and line synchronization signals of the LCD to control the display of the LCD. The CPLD uses the EPM7064 chip of ALTERA. The functions of writing control to FIFO, notifying DSP to read signals, and generating the clock signal of the LCD are completed in the CPLD.
The CPLD receives the field synchronization signal VSYNC; the falling edge of this signal marks the beginning of a frame from the image sensor, at which point the CPLD generates a negative WRST pulse to reset the FIFO's write pointer. After the falling edge of VSYNC, the CPLD waits for the rising edge of the line synchronization signal HREF and then uses the pixel clock PCLK as the write clock WCK to store the image data directly into the FIFO. When a certain amount of data has been stored, a signal is sent to the DSP so that it can read the data; this system uses the interrupt INT0 for the notification. The DSP may read the data immediately or later, depending on its processing load. When reading, RD and chip select are combined to generate the RCK signal. The DSP must not read too fast: the principle is that the read pointer must never overtake the write pointer.
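The write-then-notify handshake can be sketched in C, a minimal model assuming a raw frame of one byte per pixel; the byte source standing in for the memory-mapped FIFO data port and the batch size are illustrative assumptions:

```c
#include <stdint.h>
#include <stddef.h>

#define FRAME_BYTES (640 * 480)   /* one raw frame, 1 byte/pixel */

static uint8_t frame[FRAME_BYTES];
static volatile size_t wr_count;  /* bytes the CPLD has clocked in */
static size_t rd_count;           /* bytes the DSP has read back out */

/* INT0 handler: the CPLD raises INT0 each time another batch of
 * pixels has been written into the AL422B. */
void int0_isr(size_t batch_bytes)
{
    wr_count += batch_bytes;
}

/* On the real board each call would be a volatile read of the FIFO's
 * memory-mapped data port (the read strobe doubles as RCK); here a
 * plain byte generator stands in so the logic runs on a host. */
static uint8_t next_fifo_byte(void)
{
    static uint8_t v;
    return v++;
}

/* Drain only what the writer is known to be ahead by -- the read
 * pointer must never overtake the write pointer. */
size_t fifo_drain(void)
{
    while (rd_count < wr_count && rd_count < FRAME_BYTES)
        frame[rd_count++] = next_fifo_byte();
    return rd_count;
}
```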
In the LCD timing logic, the image sensor outputs 640×480 pixels while the LCD displays 320×240, so the pixel clock PCLK from the image sensor is divided by two in the CPLD to generate the LCD clock, and the line synchronization signal is made effective only on alternate lines, so that the LCD displays a subsampled image. The CPLD program is written in the hardware description language VHDL on the Quartus II software platform. Since the chip used is the 44-pin PLCC package of the EPM7064S series, it works only at 5 V and outputs 5 V high levels, which must be level-shifted before connecting to the devices in the system working at 3.3 V.
1.4 Frame Memory Selection
Frame memories include RAM, which needs external address lines, and FIFO, which does not. To simplify the CPLD design, a FIFO frame memory is used. FIFOs can be based on dynamic storage (DRAM) or static storage (SRAM). Static SRAM needs no refresh circuit, but its capacity is small, so multiple chips are required to store one frame; DRAM has a large capacity, so a single chip suffices, but it requires refreshing. This design uses Averlogic's large-capacity dynamic FIFO chip AL422B, whose refresh requirement is simple: either WCK or RCK just needs an uninterrupted clock above 1 MHz. The storage capacity of the AL422B is 3 Mbit (393,216 × 8 bits). Since one frame in this system is 640×480 pixels of raw data at one byte per pixel, a single chip can store the complete information of one frame, and its operating frequency reaches 50 MHz.
1.5 Video Processing DSP
In selecting the DSP, processing speed, storage capacity, available process technology, and cost-effectiveness were all considered, and TI's 144-pin TMS320VC5509A was chosen. Its maximum operating frequency reaches 200 MHz, giving a very high processing speed.
After receiving the read-notification signal from the CPLD, the DSP starts to read the video data in the AL422B. To facilitate data processing, an external SDRAM is added: HYNIX's HY57V161610E, with a capacity of 1 M × 16 bits.
When the DSP is powered on and reset, it samples the states of GPIO0~GPIO3 and selects a boot mode accordingly. This system loads the program into the DSP from an external flash memory chip over the SPI port; the DSP then initializes the image sensor's registers through its I2C port, the sensor starts to output signals, and the whole system begins to work.
As a high-speed processor, the DSP is mainly used for image processing; since this vision system must complete both recognition and tracking, the amount of data to process is very large. While performing the image processing, the DSP also acts as the controller that drives the robot head, thus forming a visual tracking system.
2 Software design
Since this system identifies unobstructed target objects by a combination of color and shape, and to achieve real-time, fast recognition, the software mainly adopts the color recognition method commonly used in soccer robots. The most common approach is color judgment based on threshold vectors; the color recognition principle is described briefly below.
2.1 Color space selection
When using color image segmentation-based methods to identify targets, we must first select a suitable color space, and commonly used color spaces include RGB, YUV, HSV, CMY, etc. The choice of color space directly affects the effect of image segmentation and target recognition.
RGB: the most commonly used color space, in which brightness information is spread across all three components R, G, and B. RGB is a non-uniform color space: the perceived difference between two colors is not linearly proportional to the Euclidean distance between their points in the space, and the R, G, and B values are highly correlated. For the same color attribute, under different conditions (light-source type, intensity, and object reflection characteristics) the RGB values scatter widely, making it difficult to determine the threshold and distribution range of a specific color. Therefore, a color space in which the brightness component can be separated out is usually chosen, the most common being YUV and HSV.
HSV: Close to the way the human eye perceives color, H stands for hue, S stands for saturation, and V stands for value. Hue H can accurately reflect the type of color and is less sensitive to changes in external lighting conditions, but H and S are both nonlinear transformations of R, G, and B, and there are singular points. Even a small change in the value of R, G, and B near the singular point will cause a large jump in the transformation value.
YUV: a brightness-chrominance space obtained from RGB by a linear transformation, originally proposed to solve the compatibility problem between color and black-and-white television. Y represents brightness (luminance) and U, V represent the color differences (chrominance). The importance of the YUV representation is that the brightness signal (Y) and the chrominance signals (U, V) are independent of each other. The color differences are the differences between the primary-color signals (R, G, B) and the brightness signal.
Therefore, for the above reasons, this system adopts the YUV color space.
The relationship between the YUV format and RGB (the standard CCIR 601 relations) is:

Y = 0.299R + 0.587G + 0.114B
U = 0.492(B − Y)
V = 0.877(R − Y)
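Numerically, the standard CCIR 601 conversion can be written as a small routine. The coefficients are the standard ones; the floating-point struct is for illustration only (a fixed-point version would be used on the DSP):

```c
#include <stdint.h>

/* CCIR 601 luminance/colour-difference conversion: Y is the weighted
 * sum of R, G, B, and U, V are the scaled B-Y and R-Y differences. */
typedef struct { double y, u, v; } yuv_t;

yuv_t rgb_to_yuv(double r, double g, double b)
{
    yuv_t out;
    out.y = 0.299 * r + 0.587 * g + 0.114 * b;
    out.u = 0.492 * (b - out.y);
    out.v = 0.877 * (r - out.y);
    return out;
}
```

For a grey pixel (R = G = B) both colour differences are zero, which is why brightness changes leave U and V untouched.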
2.2 Threshold determination and color judgment
To determine the thresholds, samples are first collected for training, yielding the upper and lower thresholds of the components of several predetermined colors in the YUV space, as shown in Figure 2.
When a pixel to be classified falls within this cuboid in the color space, the pixel is considered to belong to the color being sought, completing the color recognition. Because the Y value represents brightness and varies greatly, only the U and V values are considered; when making color judgments, threshold vectors are first established for U and V respectively.
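The threshold-vector judgment can be sketched with per-channel lookup tables, the style common in soccer-robot vision (one bit per color class, so an 8-bit word tests up to 8 classes in a single AND); the class indices and ranges below are illustrative:

```c
#include <stdint.h>

/* Bit i of table[x] is set when channel value x lies inside class i's
 * trained [lo, hi] range.  With 8-bit words, up to 8 colour classes
 * can be tested at once. */
static uint8_t u_class[256], v_class[256];

/* Mark [lo, hi] as belonging to class `bit` (0..7) in one table. */
static void set_range(uint8_t *table, uint8_t lo, uint8_t hi, int bit)
{
    for (int i = lo; i <= hi; i++)
        table[i] |= (uint8_t)(1u << bit);
}

void define_class(int bit, uint8_t u_lo, uint8_t u_hi,
                           uint8_t v_lo, uint8_t v_hi)
{
    set_range(u_class, u_lo, u_hi, bit);
    set_range(v_class, v_lo, v_hi, bit);
}

/* A pixel matches class i iff bit i is set in both channel words,
 * i.e. its (U, V) point falls inside that class's box. */
uint8_t classify(uint8_t u, uint8_t v)
{
    return u_class[u] & v_class[v];
}
```

The two table lookups and one AND per pixel are what make this method fast enough for real-time recognition.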
Since the image sensor's digital signal is 8 bits, i.e., 1 byte with 256 possible values, the system can distinguish up to 8 colors (one bit of the byte per color class). Image segmentation is performed after color recognition, using a seed-filling algorithm carried out together with the per-pixel color classification. Rather than processing every pixel from the start, the image is processed in blocks; this system uses 32×24-pixel blocks, which greatly reduces the amount of calculation. When a block's center pixel is of the color to be recognized, that point is used as a seed that spreads outward, classifying the surrounding pixels until the whole block is filled. In this process the shape of the target is recognized at the same time: the system uses a recognition algorithm based on global eigenvectors, and simultaneously obtains the moment features needed to construct the Jacobian matrix. Figure 3 is the image recognition and segmentation flow chart.
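The block-wise seed-filling step can be sketched as a 4-connected fill over one 32×24 block; the array layout and the choice of 4-connectivity are assumptions for illustration:

```c
#include <stdint.h>
#include <string.h>

#define BLK_W 32
#define BLK_H 24

/* 4-connected neighbour offsets (right, left, down, up). */
static const int dx[4] = { 1, -1, 0, 0 };
static const int dy[4] = { 0, 0, 1, -1 };

/* Grow a region from the block centre over neighbouring pixels of the
 * target colour class.  is_target[] is the per-pixel result of colour
 * classification; filled[] receives the grown region.  Returns the
 * region's pixel count (0 if the centre is not the target colour). */
int seed_fill_block(const uint8_t is_target[BLK_H][BLK_W],
                    uint8_t filled[BLK_H][BLK_W])
{
    int stack[BLK_W * BLK_H][2], top = 0, count = 0;
    memset(filled, 0, BLK_W * BLK_H);

    int cx = BLK_W / 2, cy = BLK_H / 2;
    if (!is_target[cy][cx])
        return 0;

    stack[top][0] = cx; stack[top][1] = cy; top++;
    filled[cy][cx] = 1;

    while (top > 0) {
        top--;
        int x = stack[top][0], y = stack[top][1];
        count++;
        for (int k = 0; k < 4; k++) {
            int nx = x + dx[k], ny = y + dy[k];
            if (nx >= 0 && nx < BLK_W && ny >= 0 && ny < BLK_H &&
                is_target[ny][nx] && !filled[ny][nx]) {
                filled[ny][nx] = 1;    /* mark before push: no repeats */
                stack[top][0] = nx; stack[top][1] = ny; top++;
            }
        }
    }
    return count;
}
```

Checking the centre pixel before filling is what lets most blocks be rejected with a single classification, which is where the computational saving comes from.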
2.3 Principle of visual tracking software
When the target object is identified, the vision system will adjust the lens to place the target in the center of the field of view. Once the object moves, the vision system will track the target object.
This robot visual tracking system is uncalibrated: the camera lens does not need to be calibrated in advance; instead, the principle of adaptive control is applied to adjust the image Jacobian matrix online in real time. Through feedback of two-dimensional image feature information, this method is insensitive to camera model errors, robot model errors, image errors, and image noise. The image-based visual tracking control system is shown in Figure 4.
The control quantity c drives the control system of the robot head. First, the target is placed in front of the robot's field of view to capture the desired image, and the desired feature set is extracted from it as the reference input of the tracking control system, completing the definition of the feature set required for the task. During real-time control, the robot's image sensor acquires sampled images from which the current feature set is extracted, forming visual feedback that guides the robot to complete the tracking task. Unlike simple geometric image features, the visual feature set selected by this system is a global image description: image moments.
From the relationship matrix between changes in the moment features and changes in relative pose, namely the image Jacobian matrix, a visual tracking controller is designed to complete the system's translational tracking of 3D target objects.
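As a sketch of the moment features that feed the Jacobian, the zeroth- and first-order raw moments of a segmented region give its area and centroid; these are the simplest global features whose changes the controller relates to head motion:

```c
#include <stdint.h>

/* Raw moments m00, m10, m01 of a binary region and the centroid
 * (xc, yc) derived from them. */
typedef struct { double m00, m10, m01, xc, yc; } moments_t;

moments_t region_moments(const uint8_t *mask, int w, int h)
{
    moments_t m = { 0.0, 0.0, 0.0, 0.0, 0.0 };
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            if (mask[y * w + x]) {
                m.m00 += 1.0;   /* area */
                m.m10 += x;     /* first moment about x */
                m.m01 += y;     /* first moment about y */
            }
    if (m.m00 > 0.0) {
        m.xc = m.m10 / m.m00;
        m.yc = m.m01 / m.m00;
    }
    return m;
}
```

In an uncalibrated scheme the Jacobian relating changes in such features to head motion is not derived from camera parameters but estimated and updated online from observed feature changes, which is what makes the tracking adaptive.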
3 Experimental Results
Figure 5 shows the waveform on the DSP's CLKOUT pin, indicating that the DSP's internal clock circuit works properly. The image sensor's output data waveform in Figure 6 proves that the sensor works properly, and the image data collected by the DSP in Figure 7 confirms that the entire image acquisition hardware circuit works properly.
4 Conclusion
For the vision system of a service robot, this paper completes the design of the entire system by building its hardware and software. In hardware, a typical image acquisition system is formed from a CMOS image sensor, CPLD timing control, an asynchronous dynamic FIFO data cache, and a high-speed DSP processor, and the image signal is debugged and output. In software, the color recognition and segmentation techniques of the soccer robot achieve fast, accurate recognition, and a control principle based on the image Jacobian matrix realizes an adaptive compensation tracking control system.