To enable the face recognition algorithm to run quickly, TI's DSP processor was selected, and a keyboard module and a PAL-format output module were added. The system captures and processes PAL video signals independently of a PC, performing face location, feature extraction, and face recognition on its own. In hardware, the system adopts a memory-switching scheme so that image-data caching and reading are executed independently and simultaneously by the CPLD and DSP, shortening the data-processing cycle and ensuring real-time operation. The software design covers face location, eye location, sample storage, and face recognition. Samples are selected automatically by the DSP: based on the eye positions and the face bounding box, images of equal size and identical eye distance are chosen as training samples and as samples to be recognized. During principal component analysis, the principal components are extracted to form an eigenface space, each original sample is projected to a point in that space, and the projection is sent to a KNN classifier for classification. The device is portable, consumes little power, and can be adapted through software to other fields such as motion recognition and dynamic tracking.
1 Face Detection Algorithm
The face recognition system can be divided into face detection and face recognition, and further into four modules: face detection and location, normalization, feature extraction, and face recognition. Its detailed structure is shown in Figure 1.
1.1 Face Location
Determining the position of the face in the captured image, selecting a suitable face region, and cropping it out as a sample is an important step. The quality of facial feature location and feature extraction directly affects the face recognition result. First, the coordinates of the two eyes, (x1, y1) and (x2, y2), are determined; from these, the coordinates of the upper-left and lower-right vertices of the face box, denoted (X1, Y1) and (X2, Y2), can be derived as follows.
In the formula, RH and RV are empirical constants, taken as 2.0 and 3.5 respectively in this design. In this way the coordinates of the face region are obtained in the original image, with a size that varies with the eye distance Widtheyes. However, PCA requires all input samples to have the same dimension, so the image must be normalized: in this design, each face region is scaled to 24×24. In addition, contrast adjustment and histogram equalization are applied to the image to improve recognition accuracy.
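The article's exact box expressions are not reproduced in the text above, so the sketch below is a hypothetical reconstruction of the eye-to-face-box mapping: the box is assumed centered horizontally on the eye midpoint, RH eye-distances wide, starting half an eye distance above the eye line and extending RV eye-distances down, using the article's empirical values for RH and RV.

```python
RH, RV = 2.0, 3.5  # empirical constants from the article

def face_box(x1, y1, x2, y2):
    """Hypothetical mapping from eye coordinates to a face bounding box."""
    w = x2 - x1                        # eye distance Widtheyes
    xc, yc = (x1 + x2) / 2, (y1 + y2) / 2   # eye midpoint
    X1 = int(xc - RH * w / 2)          # upper-left corner
    Y1 = int(yc - 0.5 * w)
    X2 = int(xc + RH * w / 2)          # lower-right corner
    Y2 = int(Y1 + RV * w)
    return X1, Y1, X2, Y2

# Eyes 40 px apart on a horizontal line:
print(face_box(100, 80, 140, 80))      # → (80, 60, 160, 200)
```

The resulting box would then be cropped and rescaled to 24×24 as described.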
1.2 Face Feature Extraction
When designing a face recognition classifier, a picture is usually treated as a one-dimensional vector. Although this differs from the traditional view of a picture as a matrix, it creates favorable conditions for extracting eigenfaces with principal component analysis (PCA).
The eigenface method projects an image to a point in a specific "face space" spanned by mutually orthogonal vectors, which are the principal components characterizing each face cluster. Images of different faces lie far apart in this space, while different images of the same face project close together. PCA can therefore lay the foundation for the entire face recognition system.
In the first step, N samples are collected as training set X, and the sample mean m is calculated, as shown in formula (1):
where xi belongs to the sample training set X = (x1, x2, …, xN).
The second step is to find the scatter matrix S, as shown in formula (2):
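Formulas (1) and (2) are not reproduced in the text; assuming the standard definitions, the mean is m = (1/N)·Σ xi and the scatter matrix is S = Σ (xi − m)(xi − m)ᵀ. A pure-Python sketch of both (with toy two-dimensional samples in place of flattened 24×24 images):

```python
def mean_vector(X):
    """Formula (1), assumed standard: per-component mean of the training set."""
    n, d = len(X), len(X[0])
    return [sum(x[j] for x in X) / n for j in range(d)]

def scatter_matrix(X, m):
    """Formula (2), assumed standard: sum of outer products of (x - m)."""
    d = len(m)
    S = [[0.0] * d for _ in range(d)]
    for x in X:
        diff = [x[j] - m[j] for j in range(d)]
        for i in range(d):
            for j in range(d):
                S[i][j] += diff[i] * diff[j]
    return S

X = [[1.0, 2.0], [3.0, 4.0]]   # toy samples
m = mean_vector(X)             # → [2.0, 3.0]
S = scatter_matrix(X, m)       # → [[2.0, 2.0], [2.0, 2.0]]
```

For real 576-dimensional samples S would be 576×576, whose eigenvectors are the eigenfaces discussed next.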
According to the basic principle of PCA, the eigenvalues λi and corresponding eigenvectors ei of the scatter matrix must be found. Each ei is a principal component, and the size of its eigenvalue represents the amount of information it carries, so the eigenvalues are sorted from largest to smallest as λ1, λ2, …. As shown in Figure 2, the left side is a face image reconstructed from the eigenvector corresponding to λ1, in which the outline of the face is basically distinguishable, while the right side is reconstructed from the eigenvector corresponding to λ100 and looks more like noise; using it in the system would not help recognition.
Assuming the first p eigenvalues λ1, λ2, …, λp are retained, they determine the face space E = (e1, e2, …, ep). The point to which each element of the training set X projects in this space is given by formula (3):
The formula yields a p-dimensional vector, the PCA-reduced form of the original sample, which is then fed to the KNN classifier for classification.
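Formula (3) is not reproduced above; assuming the standard eigenface projection y = Eᵀ(x − m), the step can be sketched as follows (the eigenvector here is an illustrative placeholder, not one computed from real face data):

```python
def project(x, m, E):
    """Assumed formula (3): project mean-subtracted sample x onto the
    p eigenvectors in E, yielding its p-dimensional PCA representation."""
    diff = [xi - mi for xi, mi in zip(x, m)]
    return [sum(e[j] * diff[j] for j in range(len(diff))) for e in E]

m = [2.0, 3.0]                    # training-set mean
E = [[0.7071, 0.7071]]            # single unit-length principal component
y = project([3.0, 4.0], m, E)     # 1-dimensional projection
print(y)                          # → [1.4142]
```

In the real system x is a 576-dimensional flattened 24×24 face and E holds the p leading eigenfaces.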
1.3 Construction of KNN classifier
KNN is implemented in two steps: training and recognition. During training, the dimensionality-reduced result of each class of samples is stored as the input to KNN. To classify a test point x, the algorithm expands a region around the test sample point until it contains K training sample points, then assigns x to the category that appears most frequently among those K nearest neighbors. As shown in Figure 3, the circle marks the data point to be recognized: with K = 3, the 3 points inside the solid circle are selected and the result is the category represented by the triangle; with K = 5, the 5 points inside the dotted circle are selected and the result is the category represented by the square. The choice of K therefore has a great impact on the classification result: a larger K may classify more correctly, but sacrifices performance and increases computational complexity, while a smaller K greatly reduces computation but may hurt classification accuracy.
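The KNN step described above can be sketched as a minimal classifier: Euclidean distance in the projected space and a majority vote among the K closest training points (the data below is illustrative, not from the article):

```python
from collections import Counter

def knn_classify(train, labels, x, k):
    """Assign x to the most frequent label among its k nearest neighbours."""
    d2 = [(sum((a - b) ** 2 for a, b in zip(t, x)), lab)
          for t, lab in zip(train, labels)]     # squared Euclidean distances
    d2.sort(key=lambda p: p[0])                 # nearest first
    votes = Counter(lab for _, lab in d2[:k])   # vote among k nearest
    return votes.most_common(1)[0][0]

train = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9], [5.2, 5.1]]
labels = ["A", "A", "B", "B", "B"]
print(knn_classify(train, labels, [4.8, 5.0], 3))   # → B
```

With K even, ties are possible; odd K values such as the 3 and 5 used in Figure 3 avoid two-class ties.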
2 System Hardware Design
The DSP6713 is selected for the system design. This is a floating-point processor in TI's C6000 series; it adopts a VLIW architecture, needs few equivalent cycles per instruction, and runs fast. Image acquisition uses an ordinary camera with PAL output together with TI's video decoder chip TVP5147. The chip supports multiple formats and multiple interface inputs, and outputs video data in YUV format while providing horizontal and vertical sync signals. Temporary data storage is implemented with a CPLD and SRAM. The system composition is shown in Figure 4.
2.1 TVP5147 chip
When the system powers on, the DSP first initializes the TVP5147 over the I2C bus, using its built-in I2C controller. The chip's I2C address is set by the level of its I2CA pin: if the pin is tied high, the I2C write address is 0xB8; otherwise it is 0xBA.
When the system is initialized to output 10-bit mixed YUV video data on the Y[9..0] port, the output follows the timing shown in Figure 5.
The first trace in the figure is DATACLK, the data clock provided by the TVP5147; the second is the data Y[9..0]. Each image line is preceded by a 4-word SAV code and followed by a 4-word EAV code. As Figure 5 shows, the data is in YCbCr 4:2:2 format: the stream repeats the word sequence Cb, Y, Cr, Y, so each Cb/Cr pair is shared by two luma samples. When the AVID signal is high, the current data is valid, which gives the CPLD a reference for capturing valid data. The TVP5147 also outputs the FID signal, which indicates odd and even fields.
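Unpacking that 4:2:2 word stream can be sketched as follows: each group of four words carries one Cb, two Y values (two pixels), and one Cr, so two adjacent pixels share a chroma pair (the sample values below are illustrative):

```python
def unpack_cbycry(words):
    """Turn a Cb,Y,Cr,Y word stream into (Y, Cb, Cr) pixel tuples."""
    pixels = []
    for i in range(0, len(words) - 3, 4):
        cb, y0, cr, y1 = words[i:i + 4]   # one 4-word group = two pixels
        pixels.append((y0, cb, cr))
        pixels.append((y1, cb, cr))
    return pixels

stream = [110, 60, 130, 70, 112, 62, 128, 72]
print(unpack_cbycry(stream))
# → [(60, 110, 130), (70, 110, 130), (62, 112, 128), (72, 112, 128)]
```

In the real system this demultiplexing is done in hardware by the CPLD rather than in software.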
2.2 CPLD reads and writes SRAM
The memory selected is the DS1265AB, an SRAM with fast access that retains data for up to 10 years with the system powered off. The DS1265 has a capacity of 1 MB, with 20 address lines, 8 data lines, and WE, OE, and CE signal inputs.
The CPLD is an EPM7128, which is inexpensive and supports a high counting frequency. Connecting the SRAM to the CPLD's I/O pins and coordinating the timing satisfies the SRAM read and write requirements. The schematic is shown in Figure 6.
The CPLD program is written so that its output timing meets the SRAM's storage requirements. For the specific requirements of this design, two SRAMs are used to store the odd- and even-field data respectively, and switching between them is realized by using the field signal FID to control several 74HC245 transceivers. The detailed process is shown in Figure 7. When FID is high, M1 and M4 are enabled: the CPLD drives its address signal CPLDaddr into SRAM1 while the DSP drives DSPaddr into SRAM2. M6 and M8 are also enabled, so, as the figure shows, the CPLD is writing data to SRAM1 while the DSP is reading data from SRAM2. M10 is enabled as well, so the CPLD signal CPLDctl controls reading and writing of SRAM1 while the DSP signal DSPctl controls SRAM2. When FID goes low, the roles of SRAM1 and SRAM2 are exchanged. A system built this way records both fields of video data concurrently, realizing an organic combination of the CPLD and DSP.

The CPLD's task is thus to store valid image data into the corresponding SRAM. On the rising edge of the TVP5147's AVID pin, the address is reset to its initial value 00h, so writing starts from the first address. On each rising edge of the data clock DATACLK, the Y[9..2] bits output by the TVP5147 are stored in the current address unit; bits Y1 and Y0 are discarded because the selected SRAM is only 8 bits wide. Discarding these bits slightly reduces the precision of the image data, but the impact on recognition is small. With each rising edge of DATACLK the CPLD then increments the address, so a whole field of data is written. When switching to the other field, the process is identical except that the 74HC245s redirect the writes to the other SRAM. In this way every field of data is recorded.
2.3 Design of image output system
To reduce the design burden, TV monitoring is adopted: a small TV is connected to the DSP bus through TI's video encoder chip THS8135. The YUV data obtained is output directly to the TV's AV input through the THS8135, and the DSP can display information on the TV screen, making the recognition process more user-friendly.
3 System Software Design
After the system hardware has been debugged successfully, software algorithms are needed to combine software and hardware. In this design, SRAM is extended on the DSP's EMIF, and the DSP, triggered by the read signal, stores the valid odd- and even-field data as two one-dimensional arrays for processing.
3.1 DSP Image Preprocessing
The image data output by the TVP5147 chip is in YUV format, not RGB, so the DSP must convert it to RGB before image preprocessing can be performed. The conversion is given in formula (4):
The DSP reads the image data into memory, performs the conversion, stores the resulting RGB values in the corresponding storage units, and computes the gray value Gray according to formula (5):
The final grayscale values are stored in the corresponding array. Each picture consists of two fields, so a complete frame has a resolution of 720×576. The system does not need to convert every pixel, however, so a 320×240 window is extracted and stored (320×120 per field), which greatly shortens the YUV-to-grayscale preprocessing and face-location time and improves system performance.
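Formulas (4) and (5) are not reproduced in the text; the sketch below uses the standard ITU-R BT.601 conversion and the common luminance weights Gray = 0.299R + 0.587G + 0.114B, which may differ in detail from the article's exact coefficients:

```python
def yuv_to_rgb(y, u, v):
    """Assumed formula (4): BT.601 YUV (Cb=u, Cr=v, offset 128) to RGB."""
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    clamp = lambda c: max(0, min(255, int(round(c))))
    return clamp(r), clamp(g), clamp(b)

def gray(r, g, b):
    """Assumed formula (5): standard luminance weighting."""
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))

r, g, b = yuv_to_rgb(128, 128, 128)   # neutral chroma stays grey
print((r, g, b), gray(r, g, b))       # → (128, 128, 128) 128
```

On the DSP this conversion would typically use fixed-point arithmetic rather than floating point, but the mapping is the same.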
3.2 Face Recognition Process
After face detection on the obtained 320×240 picture, the face region is cropped out as the face sample. During design, all face samples are displayed on the monitor, which reduces the chance of false face detections and improves the accuracy of the system to a certain extent.
Each face sample has a resolution of 24×24 and is input to PCA as a 576-dimensional one-dimensional vector. Figure 8(a) is a flowchart for computing the PCA projection matrix, and Figure 8(b) shows the workflow of the KNN classifier. The projections of the training samples after PCA do not need to be recomputed for every recognition: they can be computed once at initialization or stored in non-volatile media such as Flash memory, which improves the operating efficiency of the device and reduces the amount of computation.
As shown in Figure 8, the KNN classifier can determine the closest class but cannot reject a sample, so any face would be assigned to some category in the built-in sample set. Such classification is undesirable, so a rejection judgment is added, as shown in the flowchart in Figure 9.
As the flowchart shows, after a sample point is reduced in dimension by PCA, it is sent to the KNN classifier, which tentatively assigns it to some class k. Before accepting this conclusion, the sum of the Euclidean distances between the test point and the K neighbors carrying that class label is computed. Two thresholds a and b are defined: if sum < a, the sample is assigned to class k; if sum > b, it is rejected; if sum lies between a and b, an accuracy-control value is introduced, and the sample is assigned to class k only if the difference between sum and a is less than that value, otherwise it is rejected. This process indirectly solves the problems of sample misclassification and undecidable samples.
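The rejection rule above can be sketched directly; the threshold and margin values here are illustrative stand-ins for the tuned values reported in the test section:

```python
def accept_or_reject(dist_sum, a, b, eps):
    """Rejection rule: accept below a, reject above b, and in between
    accept only if dist_sum exceeds a by less than the margin eps."""
    if dist_sum < a:
        return "accept"
    if dist_sum > b:
        return "reject"
    return "accept" if (dist_sum - a) < eps else "reject"

A, B, EPS = 12400, 16200, 1000   # a, b from the article; eps illustrative
print(accept_or_reject(11000, A, B, EPS))   # → accept (below a)
print(accept_or_reject(17000, A, B, EPS))   # → reject (above b)
print(accept_or_reject(13000, A, B, EPS))   # → accept (600 < eps)
```

The quantity dist_sum is the sum of Euclidean distances from the PCA-projected test point to its K same-labelled neighbors.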
4 Test results
In this experiment, a is set to 12 400 and b to 16 200; these two values were determined empirically through extensive testing. The value of x directly affects the recognition result, so x = 4 and x = 5 were both tested.
(1) When x = 4: testing 36 face images of 12 people in the recognizable database, the program correctly identified 33; the remaining 3 were rejected and none were misclassified. Testing 33 face images of 3 people in the unrecognizable database, 22 were correctly rejected and 11 were misjudged.
(2) When x = 5: testing 36 face images of 12 people in the recognizable database, the program correctly identified 25; the remaining 11 were rejected and none were misclassified. Testing 33 face images of 3 people in the unrecognizable database, 28 were correctly rejected and 5 were misjudged.
The data above show that with x = 4 the recognition rate on the recognizable database is 91.6% and the rejection rate on the unrecognizable database is 66.7%, while with x = 5 the rates are 69.4% and 84.8% respectively. Different x values should therefore be chosen for different applications: choose x = 5 when unknown faces must be rejected as far as possible, and x = 4 when known faces must be recognized as far as possible.
5 Conclusion
The construction of this face recognition system takes its wider applicability fully into account. Instead of a USB camera, it uses an analog camera conforming to a universal video standard as the image acquisition device, so users have more freedom when choosing a camera. The device also supports multiple input interfaces: besides the ordinary RCA jack, it provides S-Video, YPbPr, and RGB inputs. Its recognition accuracy can exceed 90%, which basically meets recognition requirements. The system has good real-time performance, is easy to carry, and can be extended to dynamic image tracking, motion detection, and other fields through program modification.