Using DSP to address speech recognition applications in 3G mobile phones

Keywords: DSP, ASR

3G mobile phones, with data rates up to 2Mb/s, will be able to support multimedia applications including data services and Internet connections. As a result, most 3G mobile phones are expected to have larger screens and smaller keyboards. To avoid the inconvenience of a small keyboard, voice dialing based on automatic speech recognition (ASR) has become a widely favored feature of 3G mobile phones. If ASR can take on this task and satisfy consumers, it could eventually replace the small keyboard in 3G mobile phones altogether.

From a design perspective, ASR relies on high-performance digital signal processor (DSP) technology to run the complex algorithms required for real-time operation and to deliver clear, rapid recognition of speech. Fortunately, modern DSP technology has made great progress. It offers more computing power, lower power consumption and smaller size than ever before, so more complex and more accurate ASR functions can be added to 3G mobile phones. Combining efficient, powerful DSP cores with other components and technologies is expected to yield the channel processing solutions required by 3G mobile phones.

At present, the basic applications of ASR fall into three categories by function: speech-to-text (voice 'typing'), speaker recognition and voice command control. Together, these three categories cover the range of ASR features that 3G devices will use. Applications of speech-to-text include voice dialing and e-mail dictation. Speaker recognition enables secure voice access to personally stored data and speaker identity information, which can be used for secure purposes such as credit card shopping and banking.
Voice command control covers voice interface applications for Voice Extensible Markup Language (VXML) website content such as financial services and directory assistance. (VXML is emerging as a standard voice markup for website content.)

In terms of implementation, 3G mobile phone ASR applications can be divided into two types: terminal-centric and client/server. As shown in Figure 1, in the terminal-centric solution the 3G mobile phone completes the entire speech recognition process and sends the recognition results. In the client/server solution, the terminal device performs the preprocessing and feature extraction, and then sends the resulting parameters to a central server over an error-protected data channel, where the recognition process is completed. With the client/server structure, the 3G mobile phone must use the data channel instead of the mobile...
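
The terminal side of the client/server split can be illustrated with a small sketch. The article does not say which features the handset extracts or how they are encoded, so everything here is an assumption chosen for illustration: the 8 kHz sampling rate, the 25 ms frames with a 10 ms hop, the 0.97 pre-emphasis coefficient, the use of per-frame log energy as the only feature, and the hypothetical `pack_features` payload layout. A real front end would normally compute a richer feature vector per frame.

```python
import math
import struct

# All constants below are illustrative assumptions, not values from the article.
SAMPLE_RATE_HZ = 8000   # assumed telephony sampling rate
FRAME_LEN = 200         # 25 ms frame at 8 kHz
FRAME_STEP = 80         # 10 ms hop between frame starts
PRE_EMPHASIS = 0.97     # commonly used pre-emphasis coefficient


def pre_emphasize(samples, coeff=PRE_EMPHASIS):
    """Preprocessing: boost high frequencies with y[n] = x[n] - coeff * x[n-1]."""
    if not samples:
        return []
    out = [float(samples[0])]
    for i in range(1, len(samples)):
        out.append(samples[i] - coeff * samples[i - 1])
    return out


def split_frames(samples, frame_len=FRAME_LEN, step=FRAME_STEP):
    """Cut the signal into overlapping frames; a trailing partial frame is dropped."""
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, step)]


def log_energy(frame):
    """Feature extraction: natural log of the frame energy (floored to avoid log(0))."""
    return math.log(sum(x * x for x in frame) + 1e-10)


def extract_features(samples):
    """Terminal-side front end: preprocessing followed by one feature per frame."""
    return [log_energy(f) for f in split_frames(pre_emphasize(samples))]


def pack_features(features):
    """Hypothetical payload: uint16 frame count, then little-endian float32 values."""
    return struct.pack("<H", len(features)) + struct.pack(
        "<%df" % len(features), *features
    )


def unpack_features(payload):
    """Server-side counterpart that recovers the feature list from the payload."""
    (count,) = struct.unpack_from("<H", payload, 0)
    return list(struct.unpack_from("<%df" % count, payload, 2))


if __name__ == "__main__":
    # One second of a synthetic 440 Hz tone stands in for microphone input.
    tone = [
        int(1000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE_HZ))
        for n in range(SAMPLE_RATE_HZ)
    ]
    payload = pack_features(extract_features(tone))
    print(len(tone) * 2, "bytes of 16-bit audio ->", len(payload), "bytes of features")
```

Under these assumptions the handset sends a compact feature payload rather than raw audio, and the server only needs `unpack_features` before running recognition. In the terminal-centric solution the same front end would feed a recognizer on the phone itself.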