Design of face recognition system based on TMS320C6713

Publisher: 火星叔叔 | Last updated: 2010-06-07 | Source: Nanjing University of Aeronautics and Astronautics

To allow the face recognition algorithm to run quickly, a TI DSP was chosen as the processor, supplemented by a keyboard module and a PAL-format output module. The system captures and processes PAL video independently of a PC and runs face localization, feature extraction and face recognition on its own. On the hardware side, a memory-switching scheme lets the CPLD buffer image data while the DSP reads it, shortening the data-processing cycle and keeping the system real-time. The software covers face localization, eye localization, sample storage and face recognition. Samples are selected automatically by the DSP: based on the eye positions and the face bounding box, image patches of equal size and equal eye distance are cut out as training samples and samples to be recognized. Principal component analysis extracts the principal components to form an eigenface space, each sample is projected to a point in that space, and the projection is passed to a KNN classifier for classification. The device is portable, has low power consumption, and can be adapted to other applications such as motion recognition and dynamic tracking through software changes.

1 Face Detection Algorithm

The face recognition system consists of face detection and face recognition stages, which break down into four modules: face detection and localization, normalization, feature extraction, and face recognition. Its structure is shown in Figure 1.

1.1 Face Location
Determining the position of the face in the captured image, selecting a suitable face region, and cutting it out as a sample is an important step. The quality of facial feature localization and feature extraction directly affects the recognition result. First, the coordinates of the two eyes, (x1, y1) and (x2, y2), are determined; from them the coordinates of the upper-left and lower-right vertices of the square face region, (X1, Y1) and (X2, Y2), can be obtained indirectly. The detailed calculation is as follows

In the formula, RH and RV are empirical constants, taken as 2.0 and 3.5 respectively in this design. In this way the face region can be located in the original image; its size varies with the eye distance Widtheyes. Because PCA requires all input samples to have the same dimension, the image must be normalized, so the cropped face region is scaled to 24×24. In addition, contrast adjustment and histogram equalization are applied to improve recognition accuracy.
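As a minimal sketch of this normalization step (the function names and the nearest-neighbour sampling are illustrative choices, assuming an 8-bit grayscale face crop), the face region can be scaled to 24×24 and then histogram-equalized as follows:

#include <stdint.h>

#define FACE_W 24
#define FACE_H 24

/* Nearest-neighbour resize of an 8-bit grayscale face crop (srcW x srcH)
 * down to the fixed 24x24 sample size used as the PCA input. */
void resize_to_24x24(const uint8_t *src, int srcW, int srcH,
                     uint8_t dst[FACE_H][FACE_W])
{
    int x, y;
    for (y = 0; y < FACE_H; y++) {
        int sy = y * srcH / FACE_H;
        for (x = 0; x < FACE_W; x++) {
            int sx = x * srcW / FACE_W;
            dst[y][x] = src[sy * srcW + sx];
        }
    }
}

/* Histogram equalization of the 24x24 sample, reducing the influence of
 * lighting and contrast differences before feature extraction. */
void hist_equalize_24x24(uint8_t img[FACE_H][FACE_W])
{
    int hist[256] = {0};
    int x, y, i, acc = 0;
    const int total = FACE_W * FACE_H;

    for (y = 0; y < FACE_H; y++)
        for (x = 0; x < FACE_W; x++)
            hist[img[y][x]]++;

    /* Map each gray level through the cumulative distribution;
     * hist[] is reused as the lookup table. */
    for (i = 0; i < 256; i++) {
        acc += hist[i];
        hist[i] = acc * 255 / total;
    }

    for (y = 0; y < FACE_H; y++)
        for (x = 0; x < FACE_W; x++)
            img[y][x] = (uint8_t)hist[img[y][x]];
}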
1.2 Face feature extraction
When designing a face recognition classifier, a picture is usually treated as a one-dimensional vector. Although this differs from the traditional view of a picture as a matrix, it creates favorable conditions for extracting eigenfaces with principal component analysis (PCA).
Eigenface classification projects an image to a point in a specific "face space". This face space consists of mutually orthogonal vectors, which are the principal components characterizing the face clusters. Images of different faces lie far apart in this space, while different images of the same face project close together. The PCA method therefore lays the foundation for the whole face recognition system.
In the first step, N samples are collected as the training set X and the sample mean m is calculated, as shown in formula (1):

m = (1/N) ∑ xi,  i = 1, 2, …, N    (1)

where xi belongs to the training set X = (x1, x2, …, xN).
The second step is to find the scatter matrix S, as shown in formula (2):

S = ∑ (xi − m)(xi − m)^T,  i = 1, 2, …, N    (2)
According to the basic principle of PCA, the eigenvalues λi and the corresponding eigenvectors ei of the scatter matrix are then found. The ei are the principal components, and the magnitude of each eigenvalue indicates how much information the component carries, so the eigenvalues are sorted in descending order λ1, λ2, …. As shown in Figure 2, the left image is reconstructed from the eigenvector corresponding to λ1 and the outline of the face can basically be distinguished; the right image is reconstructed from the eigenvector corresponding to λ100 and looks more like noise, so using it in the system would not help recognition.
Assuming the first p eigenvalues λ1, λ2, …, λp are kept, they determine the face space E = (e1, e2, …, ep). The point to which each element of the training set X is projected in this space is obtained from formula (3):

yi = E^T (xi − m)    (3)
The above formula gives us a p-dimensional vector after the original vector is reduced in dimension by PCA. The next step is to input it into the KNN classifier for classification.
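A minimal sketch of this projection step, assuming the mean vector and the p eigenvectors (the rows of E) have already been computed from the training set; the array names and the value of P are illustrative:

#define DIM 576   /* 24 x 24 face sample flattened to a one-dimensional vector */
#define P    40   /* number of principal components kept (illustrative value)  */

/* Project a face sample x onto the eigenface space:
 * y[j] = e_j . (x - m), j = 1 .. p */
void pca_project(const float x[DIM], const float mean[DIM],
                 const float eigenfaces[P][DIM], float y[P])
{
    int j, k;
    for (j = 0; j < P; j++) {
        float acc = 0.0f;
        for (k = 0; k < DIM; k++)
            acc += eigenfaces[j][k] * (x[k] - mean[k]);
        y[j] = acc;
    }
}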
1.3 Construction of KNN classifier
The implementation of KNN has two stages: training and recognition. During training, the dimension-reduced projection of each class of samples is stored as the KNN reference data. The K-nearest-neighbour rule assigns a test point x to the class that appears most frequently among its K nearest neighbours: starting from the test point, the search region is expanded until it contains K training samples, and the test point is assigned to the class that occurs most often among them. In Figure 3 the circle marks the data point to be recognized. With K = 3, the three points inside the solid circle are selected and the result is the class represented by the triangles; with K = 5, the five points inside the dashed circle are selected and the result is the class represented by the squares. The choice of K therefore strongly affects the classification result: a larger K may classify more reliably but sacrifices performance and increases the computational cost, while a smaller K greatly reduces the computation but may hurt classification accuracy.
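The following sketch illustrates such a KNN vote, assuming the PCA projections of the training samples and their class labels are already stored in memory; the array sizes and the value of K are illustrative only:

#include <float.h>

#define P        40   /* PCA feature dimension (illustrative)              */
#define N_TRAIN  36   /* number of stored training projections (illustrative) */
#define N_CLASS  12   /* number of enrolled persons (illustrative)         */
#define K         3   /* neighbourhood size, as in the Figure 3 example    */

/* Squared Euclidean distance between two projected samples. */
static float dist2(const float a[P], const float b[P])
{
    float s = 0.0f;
    int i;
    for (i = 0; i < P; i++) {
        float d = a[i] - b[i];
        s += d * d;
    }
    return s;
}

/* Assign the test projection y to the class that occurs most often
 * among its K nearest training projections. */
int knn_classify(const float y[P],
                 const float train[N_TRAIN][P], const int label[N_TRAIN])
{
    float best[K];
    int   nearest[K];
    int   votes[N_CLASS] = {0};
    int   n, k, m, c, winner;

    for (k = 0; k < K; k++) { best[k] = FLT_MAX; nearest[k] = -1; }

    /* Keep the K smallest distances by simple insertion. */
    for (n = 0; n < N_TRAIN; n++) {
        float d = dist2(y, train[n]);
        for (k = 0; k < K; k++) {
            if (d < best[k]) {
                for (m = K - 1; m > k; m--) {
                    best[m]    = best[m - 1];
                    nearest[m] = nearest[m - 1];
                }
                best[k]    = d;
                nearest[k] = n;
                break;
            }
        }
    }

    /* Majority vote over the K neighbours. */
    for (k = 0; k < K; k++)
        if (nearest[k] >= 0)
            votes[label[nearest[k]]]++;

    winner = 0;
    for (c = 1; c < N_CLASS; c++)
        if (votes[c] > votes[winner])
            winner = c;
    return winner;
}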

2 System Hardware Design
The system uses the TMS320C6713, a floating-point processor in TI's C6000 series. It adopts a VLIW architecture, executes instructions in few equivalent cycles, and therefore runs fast. Image acquisition uses an ordinary camera with PAL output together with TI's TVP5147 video decoder. The chip supports multiple formats and multiple interface inputs, outputs video data in YUV format, and provides horizontal and vertical synchronization signals. Temporary data storage is implemented with a CPLD and SRAM. The system composition is shown in Figure 4.
2.1 TVP5147 chip
When the system is powered on, the TMS320C6713 first initializes the TVP5147 over the I2C bus; the DSP has an on-chip I2C bus controller. The chip's I2C address is set by the level on its I2CA pin: if the pin is tied high, the I2C write address is 0xB8, otherwise it is 0xBB.
If the system is initialized to output 10-bit multiplexed YUV video data on the Y[9..0] port, its output follows the timing shown in Figure 5.
The first line in the figure is DATACLK, the data clock provided by the TVP5147; the second line is the data Y[9…0]. Before each image line begins there is a 4-word SAV code, and similarly a 4-word EAV code after it ends. As shown in Figure 5, the data are in YCbCr format, with every pair of pixels carried as four values in the order Cb, Y, Cr, Y. When the AVID signal is high, the current data are valid, which gives the CPLD a reference for capturing valid data. The TVP5147 also outputs the FID signal, which indicates odd and even fields.
2.2 CPLD reads and writes SRAM
The memory selected is the DS1265AB, a nonvolatile SRAM that combines fast access with the ability to retain data for 10 years after power is removed. The DS1265 has a capacity of 1 MB, with 20 address lines, 8 data lines, and WE, OE, and CE control inputs.
The CPLD is an EPM7128, which is inexpensive and supports a high counting frequency. Connecting the SRAM to the CPLD's I/O pins and coordinating the timing meets the SRAM read/write requirements. The schematic is shown in Figure 6.

The CPLD program is written so that its output timing meets the SRAM's storage requirements. For this design, two SRAMs are used to store the data of the odd and even fields separately, and the switching between them is realized by using the odd/even field signal FID to control several 74HC245 bus transceivers. The detailed arrangement is shown in Figure 7. When FID is high, M1 and M4 are enabled: the CPLD drives its address signal CPLDaddr into SRAM1 while the DSP drives its address signal DSPaddr into SRAM2. At the same time M6 and M8 are enabled, so the CPLD is writing data to SRAM1 while the DSP is reading data from SRAM2; M10 is also enabled, so the CPLD control signal CPLDctl governs SRAM1 reads and writes while the DSP signal DSPctl governs SRAM2. When FID goes low, the roles of SRAM1 and SRAM2 are exchanged. A system built this way can buffer one video field while the other is being read, combining the CPLD and the DSP effectively.

The CPLD's task is thus to store valid image data in the corresponding SRAM. On the rising edge of the TVP5147's AVID pin, the address is reset to the initial value 00h, i.e., writing starts from the first address. On each rising edge of the data clock DATACLK, Y[9…2] from the TVP5147 is stored in the current address unit; bits Y1 and Y0 are discarded because the selected SRAM is only 8 bits wide. Dropping Y1 and Y0 slightly reduces the precision of the image data, but the effect on recognition is small. The CPLD then increments the address on each rising edge of DATACLK, so an entire field is written. When the other field arrives, the procedure is identical except that the 74HC245s switch the destination SRAM. In this way every field of data is recorded.

2.3 Design of image output system
To reduce the design burden, a TV is used as the monitor. The small TV is connected to the DSP bus through TI's THS8135 video DAC: the YUV data are output through the THS8135 directly to the TV's AV input, and the DSP can overlay information on the screen, which makes the recognition process more user-friendly.
3 System Software Design
After the system hardware has been debugged, software algorithms are needed to combine software and hardware. In this design the SRAM is attached to the DSP's EMIF; triggered by the read signal, the DSP stores the valid odd- and even-field data as two one-dimensional arrays for processing, as sketched below.
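As a rough illustration of this step (the EMIF base addresses, buffer size and FID-to-bank mapping below are placeholders, not the actual memory map), the DSP simply copies the field that the CPLD has finished writing into a working array:

#include <stdint.h>

/* Placeholder EMIF addresses for the two SRAM banks; the real values
 * depend on which EMIF CE space the SRAMs are mapped into. */
#define SRAM_BANK0   ((volatile uint8_t *)0xA0000000)
#define SRAM_BANK1   ((volatile uint8_t *)0xA0100000)
#define FIELD_BYTES  (320 * 120 * 2)   /* cropped window of one field, Cb/Y/Cr/Y pairs (illustrative) */

static uint8_t odd_field[FIELD_BYTES];
static uint8_t even_field[FIELD_BYTES];

/* Copy the field that the CPLD has just finished writing.  Which bank holds
 * which field follows the FID parity via the 74HC245 switches; the mapping
 * shown here is illustrative. */
void read_field(int fid_is_high)
{
    volatile uint8_t *src = fid_is_high ? SRAM_BANK1 : SRAM_BANK0;
    uint8_t *dst          = fid_is_high ? even_field : odd_field;
    uint32_t i;

    for (i = 0; i < FIELD_BYTES; i++)
        dst[i] = src[i];
}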
3.1 DSP Image Preprocessing
The image data output by the TVP5147 are not in RGB but in YUV format, so the DSP must convert them to RGB before image preprocessing can be performed. The conversion is shown in formula (4):

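As an illustrative stand-in for formula (4), the sketch below uses the common ITU-R BT.601 fixed-point conversion; the coefficients actually used in the design may differ:

#include <stdint.h>

/* Clamp a conversion result into the valid 8-bit range. */
static uint8_t clamp_u8(int v)
{
    if (v < 0)   return 0;
    if (v > 255) return 255;
    return (uint8_t)v;
}

/* YCbCr (studio range) to RGB using common ITU-R BT.601 fixed-point
 * coefficients, used here as a stand-in for formula (4). */
void ycbcr_to_rgb(uint8_t y, uint8_t cb, uint8_t cr,
                  uint8_t *r, uint8_t *g, uint8_t *b)
{
    int c = (int)y  - 16;
    int d = (int)cb - 128;
    int e = (int)cr - 128;

    *r = clamp_u8((298 * c + 409 * e + 128) >> 8);
    *g = clamp_u8((298 * c - 100 * d - 208 * e + 128) >> 8);
    *b = clamp_u8((298 * c + 516 * d + 128) >> 8);
}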
The DSP reads the image data into memory, operates on it, stores the resulting RGB values in the corresponding storage units, and computes the gray value Gray according to formula (5):

The final grayscale values are stored in the corresponding array. Each picture consists of two fields, so a complete picture has a resolution of 720×576. The system does not need to convert every pixel, however, so only a 320×240 window is extracted and stored, i.e., 320×120 per field. This greatly reduces the time spent on YUV-to-grayscale preprocessing and face localization and improves the performance of the system.
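A possible fixed-point form of the grayscale computation in formula (5), assuming the common BT.601 luma weights (0.299, 0.587, 0.114):

/* Grayscale from RGB using the common BT.601 luma weights in fixed point;
 * an assumed form of formula (5). */
static uint8_t rgb_to_gray(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint8_t)((77 * r + 150 * g + 29 * b) >> 8);
}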
3.2 Face Recognition Process
After face detection on the 320×240 picture, the face region is cut out as the face sample. During the design, all face samples are displayed on the monitor, which reduces the chance of false face detections and improves the accuracy of the system to a certain extent.
The face sample has a resolution of 24×24 and is fed to PCA as a 576-dimensional one-dimensional vector. Figure 8(a) is the flowchart for calculating the PCA projection matrix and Figure 8(b) is the workflow of the KNN classifier. The PCA projections of the training samples do not need to be recalculated for every recognition: they can be computed once at initialization or stored in non-volatile memory such as Flash, which improves the operating efficiency of the device and reduces the amount of computation.

As shown in Figure 8, the KNN classifier can determine the closest class but cannot reject a face, so any person's face would be assigned to some category in the built-in sample set. This is not desirable, so a rejection decision is added, as shown in the flowchart of Figure 9.
As shown in the flowchart, after a sample is reduced in dimension by PCA it is sent to the KNN classifier, which tentatively assigns it to class k. Before accepting this result, the sum of the Euclidean distances between the test point and the training samples labelled k is computed. Two thresholds a and b are defined: if sum < a, the sample is accepted as class k; if sum > b, it is rejected; if sum lies between a and b, an accuracy control value is introduced, and the sample is accepted as class k only if the difference between sum and a is smaller than this value, otherwise it is rejected. This process indirectly solves the problem of misclassified samples and samples that cannot be judged.
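A compact sketch of this rejection decision, following the description above; the function name is illustrative and the distance sum is assumed to be supplied by the KNN stage:

#define REJECTED  (-1)

/* Decide whether to accept the tentative KNN result k or reject the face.
 * sum_k   : sum of Euclidean distances from the test projection to the
 *           training projections carrying label k
 * a, b    : lower and upper thresholds
 * acc_ctrl: accuracy control value used in the intermediate band */
int accept_or_reject(int k, float sum_k, float a, float b, float acc_ctrl)
{
    if (sum_k < a)
        return k;            /* clearly close: accept class k */
    if (sum_k > b)
        return REJECTED;     /* clearly far: reject           */
    /* Between a and b: accept only if sum_k exceeds a by less than acc_ctrl. */
    if (sum_k - a < acc_ctrl)
        return k;
    return REJECTED;
}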
4 Test results
In this experiment, a = 12 400 and b = 16 200. Determining these two values requires many experiments to find the pattern. The value of x directly affects the recognition result; here x = 4 and x = 5 were tested.
(1) When x = 4: of 36 face images belonging to 12 people in the recognizable database, the program correctly identified 33, rejected the remaining 3, and misclassified none. Of 33 face images belonging to 3 people in the unrecognizable database, 22 were correctly rejected and 11 were misjudged.
(2) When x = 5: of 36 face images belonging to 12 people in the recognizable database, the program correctly identified 25, rejected the remaining 11, and misclassified none. Of 33 face images belonging to 3 people in the unrecognizable database, 28 were correctly rejected and 5 were misjudged.
The data above show that when x = 4 the recognition rate for the recognizable database is 91.6% and the rejection rate for the unrecognizable database is 66.7%; when x = 5 the recognition rate is 69.4% and the rejection rate is 84.8%. Different values of x should therefore be chosen for different applications: when unknown faces must be rejected as much as possible, x = 5 is preferable; when known faces must be recognized as much as possible, x = 4 is preferable.
5 Conclusion
The construction of this face recognition system takes its wider applicability into account. Instead of a USB camera, it uses an analog camera conforming to a universal video standard, so users have more freedom when choosing a camera. The device also supports multiple input interfaces: in addition to the ordinary composite (RCA) jack, it provides S-Video, YPbPr and RGB inputs. Its recognition accuracy exceeds 90%, which basically meets the recognition requirements. The system has good real-time performance and is easy to carry, and it can be extended to fields such as dynamic image tracking and motion detection by modifying the software.