Speech signal recognition using fixed-point DSP processing chip ADSP2181

Publisher: 数字狂舞  Latest update time: 2012-07-03  Source: 21ic  Keywords: DSP, ADSP2181

In recent years, the rapid development of high-performance digital signal processor (DSP) chips has made real-time speech recognition practical. Among them, Analog Devices (AD) DSP chips are widely used in many fields because of their good cost-performance ratio and code portability. We therefore use AD's fixed-point DSP chip, the ADSP2181, to implement speech signal recognition.

1 Basic process of speech recognition

Depending on the application, speech recognition systems can be classified as speaker-dependent or speaker-independent, as isolated-word or continuous-speech, and as small-vocabulary, large-vocabulary, or unlimited-vocabulary. Whatever the type, the basic principles and processing methods are broadly similar. The block diagram of a typical speech recognition system is shown in Figure 1.


The speech recognition process mainly includes speech signal preprocessing, feature extraction, and pattern matching. Preprocessing includes pre-filtering, sampling and quantization, windowing, endpoint detection, pre-emphasis, and other processes. The most important part of speech signal recognition is feature parameter extraction. The extracted feature parameters must meet the following requirements:

(1) The extracted feature parameters can effectively represent the speech features and have good distinguishability;

(2) There is good independence between the parameters of each order;

(3) The feature parameters should be easy to calculate, and it is best to have an efficient algorithm to ensure real-time implementation of speech recognition.

In the training phase, the feature parameters are processed, a model is built for each vocabulary entry, and the models are saved as a template library. In the recognition phase, the speech signal passes through the same processing chain to obtain its feature parameters and form a test template, which is matched against the reference templates; the reference template with the highest matching score is taken as the recognition result. Recognition accuracy can be improved further by exploiting prior knowledge.
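The matching step can be illustrated with a minimal sketch. The article does not say which matching method is used, so the code below assumes fixed-length feature vectors and treats the smallest squared Euclidean distance as the "highest matching score"; `FEAT_DIM`, `dist2`, and `best_match` are hypothetical names introduced only for this illustration.

```c
#include <stdint.h>

#define FEAT_DIM 4   /* hypothetical feature-vector length */

/* Squared Euclidean distance between two feature vectors. */
int64_t dist2(const int16_t *a, const int16_t *b)
{
    int64_t d = 0;
    for (int i = 0; i < FEAT_DIM; i++) {
        int32_t diff = (int32_t)a[i] - b[i];
        d += (int64_t)diff * diff;
    }
    return d;
}

/* Return the index of the reference template closest to the test
 * vector, or -1 if the template library is empty. */
int best_match(const int16_t refs[][FEAT_DIM], int n_refs, const int16_t *test)
{
    int best = -1;
    int64_t best_d = INT64_MAX;
    for (int k = 0; k < n_refs; k++) {
        int64_t d = dist2(refs[k], test);
        if (d < best_d) {
            best_d = d;
            best = k;
        }
    }
    return best;
}
```

A real recognizer would normally compare sequences of frames rather than single vectors, so this only shows the "pick the best-scoring template" idea.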

2 System Hardware Structure

2.1 Features of ADSP2181

AD's ADSP2181 is a 16-bit fixed-point DSP chip with large on-chip memory, strong computing capability, and flexible interfaces. Its main features are:

(1) Harvard architecture; with an external 16.67 MHz crystal oscillator, the instruction cycle is 30 ns, giving an instruction rate of 33 MIPS, and all instructions execute in a single cycle;

(2) 80 KB of on-chip memory: 16K words (24 bits each) of program memory and 16K words (16 bits each) of data memory;

(3) There are three independent computing units inside: arithmetic logic unit (ALU), multiplier accumulator (MAC) and barrel shifter (SHIFT), among which the multiplier accumulator supports multiple precision and automatic unbiased rounding;

(4) A 16-bit internal DMA (IDMA) port for high-speed access to on-chip memory, and an 8-bit byte DMA (BDMA) port for loading data and programs from byte-wide boot memory;

(5) 6 external interrupts, and the priority or mask can be set, etc.

Due to the above characteristics of ADSP2181, the system composed of this chip is small in size, high in performance, low in cost and power consumption, and can better implement the speech recognition algorithm.

2.2 System Hardware Structure

When building the speech recognition circuit, we adopted a master-slave design around the ADSP2181, in which the host CPU loads the program through the IDMA port. The hardware structure of the speech recognition system is shown in Figure 2.

In this structure, the PC is the master CPU and the ADSP2181 is the slave CPU. The PC loads the program into the internal memory of the ADSP2181 through the IDMA port. The PC bus is decoded by the CPLD to form control signals such as IRD, IWR, IAL, IS, etc., which are connected to the IDMA port of the ADSP2181. In this way, when the ADSP2181 is running at full speed, the host can query the running status of the slave and access all the program memory and data memory inside the ADSP2181. This greatly facilitates the compilation and debugging of the program, as well as the real-time processing of voice signals.

3 DSP Implementation Technology of Speech Recognition

3.1 Fixed-point implementation of floating-point operations

The speech recognition algorithm contains many floating-point operations, and implementing them on a fixed-point DSP is the first problem to solve when writing the recognition program. It is solved by scaling the numbers, that is, by fixing the position of the binary point within the fixed-point word. The Q format is a commonly used scaling notation, defined as follows:

Let the fixed-point number be $x$ and the floating-point number be $y$. In Q format, the two are related as follows:

Floating-point number $y$ to fixed-point number $x$: $x = (\mathrm{int})(y \times 2^{Q})$;

Fixed-point number $x$ to floating-point number $y$: $y = (\mathrm{float})\,x \times 2^{-Q}$.
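As a minimal sketch of these two conversions, the C functions below assume Q = 15 (the Q1.15 format of a 16-bit word) and add range clamping, which the formulas themselves do not mention. The names `float_to_q` and `q_to_float` are hypothetical.

```c
#include <stdint.h>

#define Q 15  /* number of fractional bits; Q15 is an assumption of this sketch */

/* Float -> fixed: x = (int)(y * 2^Q), clamped to the 16-bit range. */
int16_t float_to_q(float y)
{
    float scaled = y * (float)(1 << Q);
    if (scaled > 32767.0f)  return INT16_MAX;
    if (scaled < -32768.0f) return INT16_MIN;
    return (int16_t)scaled;   /* truncates toward zero, like the (int) cast */
}

/* Fixed -> float: y = (float)x * 2^(-Q). */
float q_to_float(int16_t x)
{
    return (float)x / (float)(1 << Q);
}
```

With Q = 15 the representable range is [-1, 1 - 2^-15], so an input of 1.0 is clamped to 32767 rather than wrapping around to a negative value.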

3.2 Data accuracy processing

Implementing the speech recognition algorithm on a 16-bit fixed-point DSP improves execution speed, but data precision is relatively low, and errors accumulated in intermediate steps can lead to incorrect results. To improve calculation precision, the program uses the following methods:

(1) Extended Precision

Where high precision is required, intermediate variables are represented with 32 or even 48 bits. This greatly improves calculation precision at the cost of only a small increase in the number of instructions.

(2) Using pseudo-floating point method to represent floating point numbers

The pseudo-floating-point method represents a number as a mantissa plus an exponent. The mantissas of a data block can use the Q1.15 format, and all values in the block share the same exponent. This representation has a large enough dynamic range to fully meet the precision requirements, but it requires writing your own library of exponent and mantissa operations, which increases the instruction count and the amount of computation and works against real-time implementation.

Both of the above methods can improve calculation accuracy, but in actual operation, a trade-off should be made based on system requirements and algorithm complexity.
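The effect of extended-precision intermediates can be sketched with a Q15 dot product. This is an illustration under stated assumptions, not the article's code: a 64-bit C integer stands in for the wider intermediate variable, and both function names are hypothetical. The right shift of a negative value is assumed to be arithmetic, as it is on common compilers.

```c
#include <stdint.h>

/* Narrow version: each product is truncated to Q15 before it is added,
 * so the truncation error grows with n. The caller must ensure that
 * the 16-bit sum does not overflow. */
int16_t dot_q15_narrow(const int16_t *a, const int16_t *b, int n)
{
    int16_t acc = 0;
    for (int i = 0; i < n; i++) {
        int16_t p = (int16_t)(((int32_t)a[i] * b[i]) >> 15);
        acc = (int16_t)(acc + p);
    }
    return acc;
}

/* Wide version: products are kept at full Q30 precision in a wide
 * accumulator and rounded to Q15 only once at the end. */
int16_t dot_q15_wide(const int16_t *a, const int16_t *b, int n)
{
    int64_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += (int32_t)a[i] * b[i];
    acc = (acc + (1 << 14)) >> 15;      /* round to nearest Q15 value */
    if (acc > INT16_MAX) acc = INT16_MAX;
    if (acc < INT16_MIN) acc = INT16_MIN;
    return (int16_t)acc;
}
```

For eight elements all equal to 100, every individual product truncates to zero in the narrow version, while the wide version returns 2, which is the correctly rounded result.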

3.3 Variable Maintenance

In high-level languages, global and local variables are stored differently, but in a DSP program every declared variable is allocated in data memory at link time. If local variables are declared the way they are in a high-level language, a great deal of DSP memory is wasted, which is unacceptable on a fixed-point DSP with limited data space. To save memory, it is best to maintain a variable table when writing DSP programs. On entering a DSP submodule, first reuse variables that have already been allocated but are not currently in use, and allocate new local variables only when these are not enough.
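One way to picture such a variable table is a small pool of statically allocated scratch buffers with an in-use flag per slot. The sketch below is hypothetical: the slot count, the slot size, and the names `scratch_acquire` and `scratch_release` are assumptions, and on the DSP itself the same bookkeeping would typically be done by hand in assembly.

```c
#include <stdint.h>
#include <stddef.h>

#define SCRATCH_SLOTS 4    /* hypothetical number of shared buffers */
#define SLOT_WORDS    64   /* hypothetical size of each buffer in words */

static int16_t scratch[SCRATCH_SLOTS][SLOT_WORDS];
static uint8_t in_use[SCRATCH_SLOTS];

/* Hand out a buffer that is already allocated but currently unused.
 * Returns NULL when every slot is busy. */
int16_t *scratch_acquire(void)
{
    for (int i = 0; i < SCRATCH_SLOTS; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            return scratch[i];
        }
    }
    return NULL;
}

/* Mark a buffer as free so that the next submodule can reuse it. */
void scratch_release(int16_t *p)
{
    for (int i = 0; i < SCRATCH_SLOTS; i++) {
        if (p == scratch[i])
            in_use[i] = 0;
    }
}
```

A submodule acquires a buffer on entry and releases it on exit, so total data memory is bounded by the largest number of buffers live at one time rather than by the sum over all submodules.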

3.4 Handling of nested loops

Many parts of a speech recognition algorithm are implemented as loops. When writing loops, pay attention to the following issues:

(1) On ADSP2100-family DSP chips, loops can be nested at most 4 levels deep; beyond that the loop stack overflows and the program does not execute correctly. In a speech recognition program, however, nesting (including interrupts) often exceeds 4 levels. In that case the do...until instruction provided by the DSP cannot be used; loop variables must be defined and maintained in software instead. Because the DSP loop stack is not used, no stack overflow occurs. In addition, if a jump instruction is used to leave a loop early, the pointers of the three stacks (PC, LOOP, and CNTR) must be maintained manually.

(2) Keep the number of instructions in the loop body as small as possible. In nested loops, each instruction in an inner loop body is executed many times, so removing instructions there shortens execution time and improves real-time performance.
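The second point can be sketched in C as moving loop-invariant work out of the inner loop. The example is hypothetical (per-frame gain scaling in Q15, with assumed function names) and only illustrates the transformation; in DSP assembly the same idea is applied by hand.

```c
#include <stdint.h>

/* Before: the gain lookup and the row offset are recomputed for
 * every sample inside the inner loop. */
void scale_frames_slow(int16_t *x, const int16_t *gain_q15, int frames, int len)
{
    for (int f = 0; f < frames; f++)
        for (int i = 0; i < len; i++)
            x[f * len + i] =
                (int16_t)(((int32_t)x[f * len + i] * gain_q15[f]) >> 15);
}

/* After: the invariant work (gain load, row address) is done once per
 * frame, leaving only a multiply and a shift in the inner loop body. */
void scale_frames_fast(int16_t *x, const int16_t *gain_q15, int frames, int len)
{
    for (int f = 0; f < frames; f++) {
        const int32_t g = gain_q15[f];
        int16_t *row = x + f * len;
        for (int i = 0; i < len; i++)
            row[i] = (int16_t)((row[i] * g) >> 15);
    }
}
```

Both functions produce identical results; only the amount of work per inner iteration differs.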

3.5 Adopt a modular programming approach

In the implementation of speech recognition algorithm, in order to facilitate the design and debugging of the program, a modular programming method is adopted. The module division is based on the basic process of speech recognition, and each module is further divided into several sub-modules, and then programming and debugging are carried out in modules. Before writing the program, the algorithm of each module is first simulated in a high-level language, and then the assembly program is written on this basis. When debugging, the debugging method of comparing high-level language with assembly language can be used. In this way, the correctness of the assembly language can be verified by tracking the intermediate state between high-level language and assembly language, and errors can be discovered and corrected in time, shortening the programming cycle. In addition, in the process of writing the program, necessary comments and instructions should be added to the key parts to enhance the readability of the program.

During system integration, the entry and exit parameters of each module must be defined, and the stack pointer, intermediate variables, and so on must be maintained.
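The comparison-based debugging method described in this section can be sketched as follows, using pre-emphasis as the example module. This is an assumption-laden illustration: the coefficient 0.95 is not given in the article, the fixed-point routine is written in C as a stand-in for the assembly version, and all function names are hypothetical.

```c
#include <stdint.h>

#define PRE_EMPH_Q15 31130  /* 0.95 in Q15 (0.95 * 32768, rounded); assumed value */

/* Reference model, as it might be written for the high-level simulation. */
void preemph_ref(const float *x, float *y, int n)
{
    y[0] = x[0];
    for (int i = 1; i < n; i++)
        y[i] = x[i] - 0.95f * x[i - 1];
}

/* Fixed-point version mirroring what the DSP code would compute:
 * Q30 intermediate, one rounding step, then saturation to 16 bits. */
void preemph_q15(const int16_t *x, int16_t *y, int n)
{
    y[0] = x[0];
    for (int i = 1; i < n; i++) {
        int32_t acc = (int32_t)x[i] * 32768 - (int32_t)PRE_EMPH_Q15 * x[i - 1];
        acc = (acc + (1 << 14)) >> 15;
        if (acc > INT16_MAX) acc = INT16_MAX;
        if (acc < INT16_MIN) acc = INT16_MIN;
        y[i] = (int16_t)acc;
    }
}

/* Largest absolute difference between the two outputs, in units of 1.0. */
float max_abs_error(const float *ref, const int16_t *fx, int n)
{
    float worst = 0.0f;
    for (int i = 0; i < n; i++) {
        float d = ref[i] - (float)fx[i] / 32768.0f;
        if (d < 0.0f) d = -d;
        if (d > worst) worst = d;
    }
    return worst;
}
```

Running both versions on the same input and checking `max_abs_error` against a tolerance gives a quick pass/fail test for each module before the modules are integrated.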

3.6 Mixed Programming Using C and Assembly Language

Most DSP chips now support mixed programming in assembly language and C or C++, and the ADSP2181 is no exception. Developing DSP programs in C shortens the development cycle and reduces program complexity, but the resulting code is less efficient and costs extra machine cycles, which works against real-time operation. For this reason, we use fixed-point techniques when writing the speech recognition algorithm in C. The ADSP2181 is a 16-bit fixed-point processor, and the following points should be noted in fixed-point processing:

(1) The ADSP2181 supports both fractional and integer arithmetic modes. Fractional mode should be selected so that the absolute value of every result stays below 1;

(2) Use double-word fixed-point arithmetic library instead of C language floating-point library to improve calculation accuracy;

(3) Pay attention to performing saturation operations after each multiplication and addition operation to prevent overflow and underflow of the result;

(4) After the loop processing, a set of data may have different exponents and needs to be normalized so that the subsequent fixed-point operations can process the exponent and mantissa separately.
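Points (3) and (4) can be sketched in C as follows. The helper names are hypothetical, and the normalization shown is a simple block scheme (one shared shift for the whole array), which is one possible reading of the step described above rather than the article's own routine.

```c
#include <stdint.h>

/* Saturating 16-bit add: clamps instead of wrapping on overflow. */
int16_t sat_add16(int16_t a, int16_t b)
{
    int32_t s = (int32_t)a + b;
    if (s > INT16_MAX) return INT16_MAX;
    if (s < INT16_MIN) return INT16_MIN;
    return (int16_t)s;
}

/* Saturating Q15 multiply. The only overflow case is -1.0 * -1.0,
 * which is clamped to the largest positive Q15 value. */
int16_t sat_mul_q15(int16_t a, int16_t b)
{
    int32_t p = ((int32_t)a * b) >> 15;
    if (p > INT16_MAX) return INT16_MAX;
    return (int16_t)p;
}

/* Block normalization: scale every sample by the largest power of two
 * that keeps the peak magnitude within 16 bits. Returns the shift,
 * which serves as the shared exponent of the block. */
int block_normalize(int16_t *x, int n)
{
    int32_t peak = 0;
    for (int i = 0; i < n; i++) {
        int32_t m = x[i] < 0 ? -(int32_t)x[i] : x[i];
        if (m > peak) peak = m;
    }
    if (peak == 0) return 0;

    int shift = 0;
    while ((peak << (shift + 1)) <= INT16_MAX)
        shift++;
    for (int i = 0; i < n; i++)
        x[i] = (int16_t)(x[i] * (1 << shift));
    return shift;
}
```

After normalization the mantissas use as much of the 16-bit range as possible, and the returned shift has to be carried along so that later stages can undo the scaling.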

4 Conclusion

Speech recognition systems built on fixed-point DSP chips have broad application prospects. The fixed-point techniques and programming principles used in writing the speech recognition algorithm also offer practical guidance for similar algorithms. In practice, the algorithm should be optimized for the characteristics of the DSP chip so that the chip's performance is fully exploited.
