To enable the face recognition algorithm to run quickly, TI's DSP processor was selected, and a keyboard module and a PAL-format output module were added. The system captures and processes PAL video signals independently of a PC, performing face location, feature extraction, and face recognition on its own. In hardware, the system adopts a memory-switching scheme so that image-data caching and reading are executed independently and simultaneously by the CPLD and DSP, shortening the data-processing cycle and ensuring real-time operation. The software design covers face location, eye location, sample storage, and face recognition. Samples are selected automatically by the DSP: based on the eye positions and the face bounding box, images of equal size and identical eye distance are chosen as training samples and as samples to be recognized. During principal component analysis, the principal components are extracted to form an eigenface space, each original sample is projected to a point in that space, and the projection is sent to a KNN classifier for classification. The device is portable, consumes little power, and can be adapted through software to other fields such as motion recognition and dynamic tracking.
1 Face Detection Algorithm
The face recognition system can be divided into face detection and face recognition, and further into four modules: face detection and location, normalization, feature extraction, and face recognition. Its detailed structure is shown in Figure 1.
1.1 Face Location
Determining the position of the face in the captured image, selecting a suitable face region, and cropping it out as a sample is an important step. The quality of facial feature location and feature extraction directly affects the face recognition result. First, the coordinates of the two eyes, (x1, y1) and (x2, y2), are determined; from these, the coordinates of the upper-left and lower-right vertices of the face box, denoted (X1, Y1) and (X2, Y2), can be derived as follows.
In the formula, RH and RV are empirical constants, taken as 2.0 and 3.5 respectively in this design. In this way the coordinates of the face region are obtained in the original image, with a size that varies with the eye distance Widtheyes. However, PCA requires all input samples to have the same dimension, so the image must be normalized: in this design, each face region is scaled to 24×24. In addition, contrast adjustment and histogram equalization are applied to the image to improve recognition accuracy.
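The article's exact box expressions are not reproduced in the text above, so the sketch below is a hypothetical reconstruction of the eye-to-face-box mapping: the box is assumed centered horizontally on the eye midpoint, RH eye-distances wide, starting half an eye distance above the eye line and extending RV eye-distances down, using the article's empirical values for RH and RV.

```python
RH, RV = 2.0, 3.5  # empirical constants from the article

def face_box(x1, y1, x2, y2):
    """Hypothetical mapping from eye coordinates to a face bounding box."""
    w = x2 - x1                        # eye distance Widtheyes
    xc, yc = (x1 + x2) / 2, (y1 + y2) / 2   # eye midpoint
    X1 = int(xc - RH * w / 2)          # upper-left corner
    Y1 = int(yc - 0.5 * w)
    X2 = int(xc + RH * w / 2)          # lower-right corner
    Y2 = int(Y1 + RV * w)
    return X1, Y1, X2, Y2

# Eyes 40 px apart on a horizontal line:
print(face_box(100, 80, 140, 80))      # → (80, 60, 160, 200)
```

The resulting box would then be cropped and rescaled to 24×24 as described.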
1.2 Face Feature Extraction
When designing a face recognition classifier, a picture is usually treated as a one-dimensional vector. Although this differs from the traditional view of a picture as a matrix, it creates favorable conditions for extracting eigenfaces with principal component analysis (PCA).
The eigenface method projects an image to a point in a specific "face space" spanned by mutually orthogonal vectors, which are the principal components characterizing each face cluster. Images of different faces lie far apart in this space, while different images of the same face project close together. PCA can therefore lay the foundation for the entire face recognition system.
In the first step, N samples are collected as training set X, and the sample mean m is calculated, as shown in formula (1):
where xi belongs to the sample training set X = (x1, x2, …, xN).
The second step is to find the scatter matrix S, as shown in formula (2):
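Formulas (1) and (2) are not reproduced in the text; assuming the standard definitions, the mean is m = (1/N)·Σ xi and the scatter matrix is S = Σ (xi − m)(xi − m)ᵀ. A pure-Python sketch of both (with toy two-dimensional samples in place of flattened 24×24 images):

```python
def mean_vector(X):
    """Formula (1), assumed standard: per-component mean of the training set."""
    n, d = len(X), len(X[0])
    return [sum(x[j] for x in X) / n for j in range(d)]

def scatter_matrix(X, m):
    """Formula (2), assumed standard: sum of outer products of (x - m)."""
    d = len(m)
    S = [[0.0] * d for _ in range(d)]
    for x in X:
        diff = [x[j] - m[j] for j in range(d)]
        for i in range(d):
            for j in range(d):
                S[i][j] += diff[i] * diff[j]
    return S

X = [[1.0, 2.0], [3.0, 4.0]]   # toy samples
m = mean_vector(X)             # → [2.0, 3.0]
S = scatter_matrix(X, m)       # → [[2.0, 2.0], [2.0, 2.0]]
```

For real 576-dimensional samples S would be 576×576, whose eigenvectors are the eigenfaces discussed next.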
According to the basic principle of PCA, the eigenvalues λi and corresponding eigenvectors ei of the scatter matrix must be found. Each ei is a principal component, and the size of its eigenvalue represents the amount of information it carries, so the eigenvalues are sorted from largest to smallest as λ1, λ2, …. As shown in Figure 2, the left side is a face image reconstructed from the eigenvector corresponding to λ1, in which the outline of the face is basically distinguishable, while the right side is reconstructed from the eigenvector corresponding to λ100 and looks more like noise; using it in the system would not help recognition.
Assuming the first p eigenvalues λ1, λ2, …, λp are retained, they determine the face space E = (e1, e2, …, ep). The point to which each element of the training set X projects in this space is given by formula (3):
The formula yields a p-dimensional vector, the PCA-reduced form of the original sample, which is then fed to the KNN classifier for classification.
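Formula (3) is not reproduced above; assuming the standard eigenface projection y = Eᵀ(x − m), the step can be sketched as follows (the eigenvector here is an illustrative placeholder, not one computed from real face data):

```python
def project(x, m, E):
    """Assumed formula (3): project mean-subtracted sample x onto the
    p eigenvectors in E, yielding its p-dimensional PCA representation."""
    diff = [xi - mi for xi, mi in zip(x, m)]
    return [sum(e[j] * diff[j] for j in range(len(diff))) for e in E]

m = [2.0, 3.0]                    # training-set mean
E = [[0.7071, 0.7071]]            # single unit-length principal component
y = project([3.0, 4.0], m, E)     # 1-dimensional projection
print(y)                          # → [1.4142]
```

In the real system x is a 576-dimensional flattened 24×24 face and E holds the p leading eigenfaces.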
1.3 Construction of KNN classifier
KNN is implemented in two steps: training and recognition. During training, the dimensionality-reduced result of each class of samples is stored as the input to KNN. To classify a test point x, the algorithm expands a region around the test sample point until it contains K training sample points, then assigns x to the category that appears most frequently among those K nearest neighbors. As shown in Figure 3, the circle marks the data point to be recognized: with K = 3, the 3 points inside the solid circle are selected and the result is the category represented by the triangle; with K = 5, the 5 points inside the dotted circle are selected and the result is the category represented by the square. The choice of K therefore has a great impact on the classification result: a larger K may classify more correctly, but sacrifices performance and increases computational complexity, while a smaller K greatly reduces computation but may hurt classification accuracy.
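The KNN step described above can be sketched as a minimal classifier: Euclidean distance in the projected space and a majority vote among the K closest training points (the data below is illustrative, not from the article):

```python
from collections import Counter

def knn_classify(train, labels, x, k):
    """Assign x to the most frequent label among its k nearest neighbours."""
    d2 = [(sum((a - b) ** 2 for a, b in zip(t, x)), lab)
          for t, lab in zip(train, labels)]     # squared Euclidean distances
    d2.sort(key=lambda p: p[0])                 # nearest first
    votes = Counter(lab for _, lab in d2[:k])   # vote among k nearest
    return votes.most_common(1)[0][0]

train = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9], [5.2, 5.1]]
labels = ["A", "A", "B", "B", "B"]
print(knn_classify(train, labels, [4.8, 5.0], 3))   # → B
```

With K even, ties are possible; odd K values such as the 3 and 5 used in Figure 3 avoid two-class ties.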
2 System Hardware Design
The DSP6713 is selected for the system design. This is a floating-point processor in TI's C6000 series; it adopts a VLIW architecture, needs few equivalent cycles per instruction, and runs fast. Image acquisition uses an ordinary camera with PAL output together with TI's video decoder chip TVP5147. The chip supports multiple formats and multiple interface inputs, and outputs video data in YUV format while providing horizontal and vertical sync signals. Temporary data storage is implemented with a CPLD and SRAM. The system composition is shown in Figure 4.
2.1 TVP5147 chip
When the system powers on, the DSP first initializes the TVP5147 over the I2C bus, using its built-in I2C controller. The chip's I2C address is set by the level of its I2CA pin: if the pin is tied high, the I2C write address is 0xB8; otherwise it is 0xBA.
When the system is initialized to output 10-bit mixed YUV video data on the Y[9..0] port, the output follows the timing shown in Figure 5.
The first trace in the figure is DATACLK, the data clock provided by the TVP5147; the second is the data Y[9..0]. Each image line is preceded by a 4-word SAV code and followed by a 4-word EAV code. As Figure 5 shows, the data is in YCbCr 4:2:2 format: the stream repeats the word sequence Cb, Y, Cr, Y, so each Cb/Cr pair is shared by two luma samples. When the AVID signal is high, the current data is valid, which gives the CPLD a reference for capturing valid data. The TVP5147 also outputs the FID signal, which indicates odd and even fields.
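Unpacking that 4:2:2 word stream can be sketched as follows: each group of four words carries one Cb, two Y values (two pixels), and one Cr, so two adjacent pixels share a chroma pair (the sample values below are illustrative):

```python
def unpack_cbycry(words):
    """Turn a Cb,Y,Cr,Y word stream into (Y, Cb, Cr) pixel tuples."""
    pixels = []
    for i in range(0, len(words) - 3, 4):
        cb, y0, cr, y1 = words[i:i + 4]   # one 4-word group = two pixels
        pixels.append((y0, cb, cr))
        pixels.append((y1, cb, cr))
    return pixels

stream = [110, 60, 130, 70, 112, 62, 128, 72]
print(unpack_cbycry(stream))
# → [(60, 110, 130), (70, 110, 130), (62, 112, 128), (72, 112, 128)]
```

In the real system this demultiplexing is done in hardware by the CPLD rather than in software.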
2.2 CPLD reads and writes SRAM
The memory selected is the DS1265AB, an SRAM with fast access that retains data for up to 10 years with the system powered off. The DS1265 has a capacity of 1 MB, with 20 address lines, 8 data lines, and WE, OE, and CE signal inputs.
The CPLD is an EPM7128, which is inexpensive and supports a high counting frequency. Connecting the SRAM to the CPLD's I/O pins and coordinating the timing satisfies the SRAM read and write requirements. The schematic is shown in Figure 6.
The CPLD program is written so that its output timing meets the SRAM's storage requirements. For the specific requirements of this design, two SRAMs are used to store the odd- and even-field data respectively, and switching between them is realized by using the field signal FID to control several 74HC245 transceivers. The detailed process is shown in Figure 7. When FID is high, M1 and M4 are enabled: the CPLD drives its address signal CPLDaddr into SRAM1 while the DSP drives DSPaddr into SRAM2. M6 and M8 are also enabled, so, as the figure shows, the CPLD is writing data to SRAM1 while the DSP is reading data from SRAM2. M10 is enabled as well, so the CPLD signal CPLDctl controls reading and writing of SRAM1 while the DSP signal DSPctl controls SRAM2. When FID goes low, the roles of SRAM1 and SRAM2 are exchanged. A system built this way records both fields of video data concurrently, realizing an organic combination of the CPLD and DSP.

The CPLD's task is thus to store valid image data into the corresponding SRAM. On the rising edge of the TVP5147's AVID pin, the address is reset to its initial value 00h, so writing starts from the first address. On each rising edge of the data clock DATACLK, the Y[9..2] bits output by the TVP5147 are stored in the current address unit; bits Y1 and Y0 are discarded because the selected SRAM is only 8 bits wide. Discarding these bits slightly reduces the precision of the image data, but the impact on recognition is small. With each rising edge of DATACLK the CPLD then increments the address, so a whole field of data is written. When switching to the other field, the process is identical except that the 74HC245s redirect the writes to the other SRAM. In this way every field of data is recorded.
2.3 Design of image output system
To reduce the design burden, TV monitoring is adopted: a small TV is connected to the DSP bus through TI's video encoder chip THS8135. The YUV data obtained is output directly to the TV's AV input through the THS8135, and the DSP can display information on the TV screen, making the recognition process more user-friendly.
3 System Software Design
After the system hardware has been debugged successfully, software algorithms are needed to combine software and hardware. In this design, SRAM is extended on the DSP's EMIF, and the DSP, triggered by the read signal, stores the valid odd- and even-field data as two one-dimensional arrays for processing.
3.1 DSP Image Preprocessing
The image data output by the TVP5147 chip is in YUV format, not RGB, so the DSP must convert it to RGB before image preprocessing can be performed. The conversion is given in formula (4):
The DSP reads the image data into memory, performs the conversion, stores the resulting RGB values in the corresponding storage units, and computes the gray value Gray according to formula (5):
The final grayscale values are stored in the corresponding array. Each picture consists of two fields, so a complete frame has a resolution of 720×576. The system does not need to convert every pixel, however, so a 320×240 window is extracted and stored (320×120 per field), which greatly shortens the YUV-to-grayscale preprocessing and face-location time and improves system performance.
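Formulas (4) and (5) are not reproduced in the text; the sketch below uses the standard ITU-R BT.601 conversion and the common luminance weights Gray = 0.299R + 0.587G + 0.114B, which may differ in detail from the article's exact coefficients:

```python
def yuv_to_rgb(y, u, v):
    """Assumed formula (4): BT.601 YUV (Cb=u, Cr=v, offset 128) to RGB."""
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    clamp = lambda c: max(0, min(255, int(round(c))))
    return clamp(r), clamp(g), clamp(b)

def gray(r, g, b):
    """Assumed formula (5): standard luminance weighting."""
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))

r, g, b = yuv_to_rgb(128, 128, 128)   # neutral chroma stays grey
print((r, g, b), gray(r, g, b))       # → (128, 128, 128) 128
```

On the DSP this conversion would typically use fixed-point arithmetic rather than floating point, but the mapping is the same.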
3.2 Face Recognition Process
After face detection on the obtained 320×240 picture, the face region is cropped out as the face sample. During design, all face samples are displayed on the monitor, which reduces the chance of false face detections and improves the accuracy of the system to a certain extent.
Each face sample has a resolution of 24×24 and is input to PCA as a 576-dimensional one-dimensional vector. Figure 8(a) is a flowchart for computing the PCA projection matrix, and Figure 8(b) shows the workflow of the KNN classifier. The projections of the training samples after PCA do not need to be recomputed for every recognition: they can be computed once at initialization or stored in non-volatile media such as Flash memory, which improves the operating efficiency of the device and reduces the amount of computation.
As shown in Figure 8, the KNN classifier can determine the closest class but cannot reject a sample, so any face would be assigned to some category in the built-in sample set. Such classification is undesirable, so a rejection judgment is added, as shown in the flowchart in Figure 9.
As the flowchart shows, after a sample point is reduced in dimension by PCA, it is sent to the KNN classifier, which tentatively assigns it to some class k. Before accepting this conclusion, the sum of the Euclidean distances between the test point and the K neighbors carrying that class label is computed. Two thresholds a and b are defined: if sum < a, the sample is assigned to class k; if sum > b, it is rejected; if sum lies between a and b, an accuracy-control value is introduced, and the sample is assigned to class k only if the difference between sum and a is less than that value, otherwise it is rejected. This process indirectly solves the problems of sample misclassification and undecidable samples.
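The rejection rule above can be sketched directly; the threshold and margin values here are illustrative stand-ins for the tuned values reported in the test section:

```python
def accept_or_reject(dist_sum, a, b, eps):
    """Rejection rule: accept below a, reject above b, and in between
    accept only if dist_sum exceeds a by less than the margin eps."""
    if dist_sum < a:
        return "accept"
    if dist_sum > b:
        return "reject"
    return "accept" if (dist_sum - a) < eps else "reject"

A, B, EPS = 12400, 16200, 1000   # a, b from the article; eps illustrative
print(accept_or_reject(11000, A, B, EPS))   # → accept (below a)
print(accept_or_reject(17000, A, B, EPS))   # → reject (above b)
print(accept_or_reject(13000, A, B, EPS))   # → accept (600 < eps)
```

The quantity dist_sum is the sum of Euclidean distances from the PCA-projected test point to its K same-labelled neighbors.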
4 Test results
In this experiment, a is set to 12 400 and b to 16 200; these two values were determined empirically through extensive testing. The value of x directly affects the recognition result, so x = 4 and x = 5 were both tested.
(1) When x = 4: testing 36 face images of 12 people in the recognizable database, the program correctly identified 33; the remaining 3 were rejected and none were misclassified. Testing 33 face images of 3 people in the unrecognizable database, 22 were correctly rejected and 11 were misjudged.
(2) When x = 5: testing 36 face images of 12 people in the recognizable database, the program correctly identified 25; the remaining 11 were rejected and none were misclassified. Testing 33 face images of 3 people in the unrecognizable database, 28 were correctly rejected and 5 were misjudged.
The data above show that with x = 4 the recognition rate on the recognizable database is 91.6% and the rejection rate on the unrecognizable database is 66.7%, while with x = 5 the rates are 69.4% and 84.8% respectively. Different x values should therefore be chosen for different applications: choose x = 5 when unknown faces must be rejected as far as possible, and x = 4 when known faces must be recognized as far as possible.
5 Conclusion
The construction of this face recognition system takes its wider applicability fully into account. Instead of a USB camera, it uses an analog camera conforming to a universal video standard as the image acquisition device, so users have more freedom when choosing a camera. The device also supports multiple input interfaces: besides the ordinary RCA jack, it provides S-Video, YPbPr, and RGB inputs. Its recognition accuracy can exceed 90%, which basically meets recognition requirements. The system has good real-time performance, is easy to carry, and can be extended to dynamic image tracking, motion detection, and other fields through program modification.