Design and implementation of voice-activated electronic notepad based on DSP

　　After decades of development, speech recognition and speech coding technologies have matured and entered practical use. Speech recognition is now used in telephone inquiry services, smart toys, PDAs, home appliances, communications, industrial control, language learning, and other fields. In speech coding, the code-excited linear prediction (CELP) algorithm is known for good sound quality and a high compression ratio, and has been widely used in communications and digital recording equipment. Compared with handwriting input, voice input is simple to operate, convenient to search, and highly accurate in recognition; it saves considerable input time and reduces the complexity of information retrieval. The voice-activated electronic notepad system introduced in this article implements speech recognition and speech encoding and decoding on the same DSP chip and uses speech in place of other input methods, which improves the flexibility of the system.

　　1 System functions

　　It can store 200 voice business cards. Each card contains four items: name, phone number, work unit, and remarks. Cards are retrieved by voice control: the user only needs to speak the name of the person being looked up to obtain the phone number, work unit, remarks, and other information, while the phone number is displayed on the LCD screen.

　　It has a dual-tone multi-frequency (DTMF) dialing function. Once the user has found a phone number by voice control and it is displayed on the LCD screen, touching a single dial key is enough to dial quickly and automatically through the microphone of an ordinary telephone.
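　　As a rough illustration of what DTMF dialing involves, the sketch below generates the tone pair for each digit of a number. The row and column frequencies are the standard DTMF keypad values; the 8 kHz sample rate, the 100 ms tone and gap lengths, and the function names are assumptions made for this example and do not come from the system described here.

```python
import math

# Standard DTMF frequencies for a 12-key telephone keypad.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477)
KEYPAD = ("123", "456", "789", "*0#")


def dtmf_freqs(key):
    """Return the (low, high) frequency pair in Hz for one keypad key."""
    for row_index, row in enumerate(KEYPAD):
        col_index = row.find(key)
        if col_index >= 0:
            return ROW_HZ[row_index], COL_HZ[col_index]
    raise ValueError("not a DTMF key: %r" % key)


def dtmf_tone(key, duration_ms=100, sample_rate=8000):
    """Return samples in [-1, 1]: the sum of the two sine waves for this key."""
    low, high = dtmf_freqs(key)
    count = sample_rate * duration_ms // 1000
    return [
        0.5 * math.sin(2 * math.pi * low * i / sample_rate)
        + 0.5 * math.sin(2 * math.pi * high * i / sample_rate)
        for i in range(count)
    ]


def dial(number, tone_ms=100, gap_ms=100, sample_rate=8000):
    """Concatenate one tone burst and one silent gap per digit (assumed timing)."""
    silence = [0.0] * (sample_rate * gap_ms // 1000)
    samples = []
    for key in number:
        samples.extend(dtmf_tone(key, tone_ms, sample_rate))
        samples.extend(silence)
    return samples
```

　　In a real device these samples would be sent to the D/A converter and played into the telephone handset; here they are simply returned as a list.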

　　It provides digital recording with voice-controlled playback. Each recording is tagged with a voice mark; for playback, the user only needs to speak the voice mark of the desired segment, and the recording is found and played back automatically. The total recording time exceeds two and a half hours.

　　It has convenient editing functions: voice business cards and digital recording segments can be added and deleted.

　　2 System composition

　　The voice-activated electronic notepad system introduced in this article is implemented on a 16-bit fixed-point DSP chip. It includes a speaker-dependent isolated-word speech recognition algorithm and an algebraic code-excited linear prediction (ACELP) speech codec algorithm. The recognition algorithm uses Mel-frequency cepstral coefficients (MFCC) as speech features and dynamic time warping (DTW) as the template matching algorithm; for ordinary entries, the recognition rate exceeds 99%. The ACELP algorithm used for the "recording" function is a codec with excellent performance at its bit rate. Although its complexity is high, the mean opinion score (MOS) of the reconstructed speech reaches 4.0, which is very close to the score of 4.3 for the speech before compression.
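　　To make the template-matching step concrete, here is a minimal sketch of DTW-based isolated-word matching. It is a textbook form of the algorithm, not the fixed-point implementation used on the DSP: the Euclidean frame distance, the unconstrained warping path, and the names `dtw_distance` and `recognize` are assumptions for illustration. Each utterance is a sequence of feature vectors (in the real system these would be MFCC frames).

```python
def frame_distance(x, y):
    """Euclidean distance between two feature vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def dtw_distance(seq_a, seq_b):
    """Accumulated cost of the best time alignment between two sequences."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # seq_a advances, seq_b frame repeats
                cost[i][j - 1],      # seq_b advances, seq_a frame repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def recognize(features, templates):
    """Return the name of the stored template closest to the input utterance."""
    return min(templates, key=lambda name: dtw_distance(features, templates[name]))
```

　　Because the system is speaker-dependent, `templates` would hold one or more feature sequences recorded by the user during training, keyed by entry name.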

　　As shown in Figure 1, the MCU controls the system bus, accepts external keyboard input, and displays information on the LCD screen. In this system, voice serves as most of the human-machine interface: complex keyboard operations are eliminated as far as possible, and voice prompts or voice playback replace part of the LCD text prompts. Since the voice must be processed by the DSP, the MCU has to exchange information with the DSP frequently in order to provide a friendly operating interface.

　　2.1 DSP

　　The DSP (ADSP2185) is the signal processing center of the entire hardware system. It performs speech recognition, training, encoding, and decoding, manages and schedules data in the on-chip RAM and the external FLASH memory chips, and provides concise commands and feedback information to the main control chip, the MCU. The ADSP2185 is a product of Analog Devices. Its main features are as follows:

　　Operates at 50 MIPS, with every instruction an efficient single-cycle instruction.

　　Provides 80Kbyte of on-chip RAM, of which 32Kbyte is data RAM and 48Kbyte is program RAM.

　　Supports a maximum external storage area of 4Mbyte for storing data or programs.

　　Provides DMA support between the byte storage area (BM) and the on-chip RAM area.

　　Provides two serial ports that are programmable, full-duplex, and automatically perform transmit and receive buffer operations.

　　2.2 MCU

　　The MCU (KS57C2316) is a cost-effective CMOS 4-bit microcontroller produced by SAMSUNG. It is one-time programmable, which suits small-batch production, is widely used in household appliance control, and has powerful I/O functions. Its main features are as follows:

　　ROM: 16K × 8 bit;

　　RAM: 512 × 4 bit;

　　40 I/O pins;

　　LCD driver for a display of up to 16 digits, with 32 segment pins and 4 common pins.

　　These features ensure the system's main control capability and provide flexible external interfaces, leaving room for further improvement and modification. Additional functions such as a calculator can be implemented directly on the MCU.

　　2.3 Data FLASH memory

　　The data FLASH memory, KM29U64000, is a product of SAMSUNG. It operates at 3 volts with low power consumption, offers large capacity, low price, and high speed, and retains its data after power-off. Its main specifications are as follows:

　　Memory cell array: (8M + 256K) × 8 bit;

　　Data register: (512 + 16) × 8 bit;

　　Page program size: (512 + 16) bytes;

　　Block erase size: (8K + 256) bytes;

　　Command/address/data multiplexed I/O port;

　　Reliable CMOS floating-gate technology, with an endurance of one million write/erase cycles and a data retention time of 10 years;

　　Command register operation.
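　　The figures above can be cross-checked with a little arithmetic, sketched below. The page, block, and array sizes come from the list; the 30 ms frame and the 20-byte and 24-byte frame sizes are the standard G.723.1 values for its 5.3 kbit/s (ACELP) and 6.3 kbit/s rates. The estimate assumes the entire array holds compressed speech, which is an idealization: the business cards, recognition templates, and any bookkeeping data also need space, and the article does not state the exact bit rate in use.

```python
PAGE_DATA_BYTES = 512                 # main area of one page
PAGE_SPARE_BYTES = 16                 # spare area of one page
BLOCK_DATA_BYTES = 8 * 1024           # main area of one erase block
ARRAY_DATA_BYTES = 8 * 1024 * 1024    # main area of the whole device

pages_per_block = BLOCK_DATA_BYTES // PAGE_DATA_BYTES   # 16 pages per block
total_pages = ARRAY_DATA_BYTES // PAGE_DATA_BYTES       # 16384 pages
total_blocks = ARRAY_DATA_BYTES // BLOCK_DATA_BYTES     # 1024 blocks

FRAME_MS = 30  # one G.723.1 frame covers 30 ms of speech


def recording_hours(frame_bytes, usable_bytes=ARRAY_DATA_BYTES):
    """Upper bound on recording time if usable_bytes holds only speech frames."""
    frames = usable_bytes // frame_bytes
    return frames * FRAME_MS / 1000 / 3600


hours_at_5k3 = recording_hours(20)  # 20-byte frames: about 3.5 hours
hours_at_6k3 = recording_hours(24)  # 24-byte frames: about 2.9 hours
```

　　Under these assumptions either rate leaves the stated total of more than two and a half hours within reach of an 8 Mbyte device.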

　　2.4 Other devices

　　This system also uses Analog Devices' codec (A/D and D/A converter) AD73311L, and SST's FLASH memory SST29LE010 (128K × 8 bit) to store the DSP programs and calculation data.

　　3 Interconnection and mutual control of main chips in the system

　　3.1 MCU and DSP

　　The MCU and DSP are connected through a serial port, which carries command words from the MCU to the DSP and feedback words from the DSP to the MCU. DSP serial port 1 is used for voice input and output, and serial port 2 is connected to the MCU. Since the DSP's serial transmit/receive mode is not compatible with the MCU's serial port, the MCU emulates a serial port on its I/O pins to connect to the DSP. To ensure stable data transmission, the MCU receives in interrupt mode. Each transfer generally involves more than one byte, so the data is sent in packets.
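　　The article does not give the packet layout, so the following is only a sketch of how a multi-byte command could be packaged for such a link. The sync byte value, the one-byte length field, the additive checksum, and the names `pack` and `unpack` are all hypothetical choices made for illustration.

```python
SYNC = 0xA5  # hypothetical start-of-packet marker, not taken from the article


def pack(command, payload=b""):
    """Build SYNC | command | length | payload | checksum (all hypothetical)."""
    if len(payload) > 255:
        raise ValueError("payload too long for a one-byte length field")
    body = bytes([command, len(payload)]) + bytes(payload)
    checksum = sum(body) & 0xFF  # simple 8-bit additive checksum
    return bytes([SYNC]) + body + bytes([checksum])


def unpack(packet):
    """Validate a packet and return (command, payload); raise ValueError if bad."""
    if len(packet) < 4 or packet[0] != SYNC:
        raise ValueError("missing sync byte or packet too short")
    command, length = packet[1], packet[2]
    if len(packet) != 4 + length:
        raise ValueError("length field does not match packet size")
    if sum(packet[1:-1]) & 0xFF != packet[-1]:
        raise ValueError("checksum mismatch")
    return command, bytes(packet[3:3 + length])
```

　　Framing of this kind lets the receiver's interrupt routine detect a dropped or corrupted byte and discard the whole packet instead of acting on a partial command.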

　　3.2 DSP and data FLASH memory

　　The interface between the DSP and the data FLASH memory is shown in Figure 2. Eight bits of the DSP data bus are connected to the bus of the FLASH memory. The DSP's read and write control lines, RD and WR, control reading and writing of the FLASH chip. The DSP's I/O port FL0 drives the CE (chip select) pin of the FLASH memory, so that other bus operations by the DSP cannot cause misoperation of the FLASH chip. Another DSP I/O port, PF3, is connected to the ready/busy signal line of the FLASH chip to monitor its working status. The two address lines A8 and A9 are connected to the two latch enable inputs of the FLASH chip, CLE and ALE respectively, to control the state of the bus.

　　3.3 DSP and program FLASH memory

　　The byte memory area of the ADSP2185 is an 8-bit-wide external bidirectional memory space that can store programs and data. The whole 4 Mbyte byte memory space consists of 256 pages of 16K × 8 bit. The byte memory area can only be accessed through BDMA. In BDMA mode, A0~A13 serve as the low address lines and the data lines from D16 upward serve as the extended high address lines; together they provide the external addressing capability. This design uses D16~D18, which is enough for the 128K × 8 program FLASH, while the full 4 Mbyte space requires D16~D23. D8~D15 serve as the data bus, and BMS, RD, and WR provide the chip select, read, and write signals of the memory respectively. The interface between the DSP and the program FLASH memory is shown in Figure 3.

　　It is worth mentioning that the ADSP2185 development system supports a "reload" function. The basic idea is that when the on-chip RAM (program RAM or data RAM) is not large enough, the main program can dynamically load the subroutines it needs. A subroutine resides in the program FLASH memory and is transferred into the DSP's RAM only when it has to run, which is equivalent to expanding the DSP's RAM by software. This feature is very convenient for this system: the speech recognition and codec programs are relatively long and cannot all be loaded into the DSP's RAM at the same time, so the system software must rely on it to load programs dynamically.
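　　The paging arithmetic and the load-on-demand idea can be modeled in a few lines. This is a behavioral sketch only, not DSP code: the page size and page count follow the description above, while `split_byte_address`, `OverlayLoader`, and the module names used with it are hypothetical, and the byte copy merely stands in for a BDMA transfer from program FLASH into on-chip RAM.

```python
PAGE_SIZE = 16 * 1024   # bytes per byte-memory page (16K x 8)
PAGE_COUNT = 256        # pages in the 4 Mbyte byte-memory space


def split_byte_address(addr):
    """Split a linear byte-memory address into (page, offset within page)."""
    if not 0 <= addr < PAGE_SIZE * PAGE_COUNT:
        raise ValueError("address outside the 4 Mbyte byte-memory space")
    return addr >> 14, addr & (PAGE_SIZE - 1)


class OverlayLoader:
    """Hypothetical model of loading a subroutine from FLASH only when needed."""

    def __init__(self, flash_images):
        self.flash_images = flash_images  # name -> code bytes held in program FLASH
        self.resident = None              # name of the module currently in RAM
        self.ram = b""                    # stands in for the shared RAM region
        self.transfers = 0                # number of FLASH-to-RAM copies performed

    def ensure_loaded(self, name):
        """Copy the module into RAM unless it is already resident."""
        if self.resident != name:
            self.ram = bytes(self.flash_images[name])  # stands in for a BDMA transfer
            self.resident = name
            self.transfers += 1
        return self.ram
```

　　For example, calling `ensure_loaded("recognize")` twice in a row performs one transfer, while switching to another module such as `"encode"` replaces the resident code and costs a second transfer. That cost is the price paid for running programs that do not all fit in RAM at once.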

　　4 System software design

　　This system uses dynamic program loading to expand the available hardware resources, so that several complex algorithms can work together on one chip. In terms of program structure, C and assembly language are mixed, which makes full use of the DSP's computing speed while preserving the flexibility of the program. The software has a clear hierarchical structure and clear module division, which makes local modification and upgrading easy. The data storage structure of the memory is also carefully designed according to the grouping and dependencies of the parameters required by the different functions, and program modules such as data storage, deletion, and search are properly encapsulated for upper-layer programs to call.

　　4.1 Composition of system software modules

　　The software design of the system consists of two parts: MCU software and DSP software. The MCU software mainly includes the clock and calendar, power monitoring and control, scientific calculation, keyboard scanning, LCD display driver, and DSP communication programs, as shown in Figure 4. The DSP software mainly includes five functional modules: basic system I/O, FLASH management, speech recognition, G.723.1 encoding, and G.723.1 decoding, as shown in Figure 5. It is divided into six layers. The upper three layers are written in C to enhance the flexibility of the program; the lower three layers are written in assembly language and mainly contain the algorithms, system settings, and peripheral device control.

　　5 Application prospects

　　The voice-activated electronic notepad system we have developed is a prototype of the future SPDA (Speech Personal Digital Assistant). It integrates speech recognition, speech compression coding and decoding, DSP-based speech signal processing, and data management for large-capacity FLASH memory. The technology used in this system can be applied to voice-dialing telephones, telephones with voice dialing and voice recording, mobile phone companions (voice dialing, reporting, answering), PDAs, Walkmans, voice toys, voiceprint locks, voice portals, and so on.
