Improve speech recognition in smartphones and tablets with high-performance voice capture SoCs-EEWORLD

In recent years, the market for mobile and portable devices such as smartphones and laptops has continued to grow rapidly. While these products keep integrating new features to enhance the user experience, basic voice communication still leaves ample room for improvement, especially in making speech clearer in noisy environments while preserving its natural fidelity. For example, a user walking through a crowded commercial district may be surrounded by car horns, engine roar, construction noise, crowd chatter, footsteps and even wind noise. When that user makes a voice call on a mobile phone, traditional technology struggles to deliver clear speech. Manufacturers are also adding video calling to emerging tablets. When these devices are used for teleconferencing, the surroundings may likewise contain a variety of noises, such as office chatter, nearby conversations, computer noise, pen strokes and clinking glassware, and it is equally difficult to deliver a clear call.

In these applications, several methods can be used to reduce or filter out environmental noise and improve voice communication, such as special noise-reduction microphones, analog circuit noise reduction, or digital circuit noise reduction (see Table 1). Each method has its own characteristics. By comparison, digital circuit noise reduction is flexible, requires less complex acoustic design, and delivers superior noise reduction. Beyond good noise reduction, portable device designers also face a range of design constraints and challenges, such as size, energy consumption, physical acoustic design, audio fidelity and cost.

Table 1: Comparison of different noise reduction technologies

Advanced dual-microphone real-time adaptive noise reduction technology

ON Semiconductor has recently launched the BelaSigna R261, a high-performance voice capture system-on-chip (SoC) based on digital circuit noise reduction. The device uses advanced dual-microphone noise reduction technology to help designs achieve excellent noise reduction (see Figure 1). This signal processing technology takes the signals from two microphones, distinguishes between different types of signals, extracts the useful voice information and suppresses environmental noise, making the speech easier to recognize.

Figure 1: BelaSigna R261 uses advanced real-time adaptive noise reduction algorithm

BelaSigna R261 has a speech extraction algorithm built into its integrated ROM. The algorithm uses one or more sensors to extract propagating-wave signals without prior knowledge of the sound source or sensor positions. It applies global optimization criteria and works in the frequency, time and spatial domains simultaneously. It places no restriction on the number of sound sources or sensors and is independent of the signal-to-noise ratio (SNR); that is, it works equally well in low-SNR and high-SNR environments. This makes it well suited to applications such as mobile phones and portable computers, which must extract the useful speech signal under a wide range of noise conditions.
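The article does not disclose the internals of the R261 algorithm, so the following Python sketch is not that algorithm. It swaps in the simplest spatial-domain idea, averaging two microphone signals, only to illustrate why a second microphone helps at all. It assumes the speech arrives identically at both microphones and that the noise at each microphone is independent; the 16 kHz sample rate, the 300 Hz test tone and all names are illustrative assumptions.

```python
import math
import random

SAMPLE_RATE_HZ = 16000  # assumed sample rate, not taken from the article
TONE_HZ = 300           # stand-in for a speech component (assumption)


def snr_db(clean, noisy):
    """Return the SNR in dB of `noisy` measured against the clean signal."""
    signal_power = sum(s * s for s in clean)
    noise_power = sum((x - s) ** 2 for s, x in zip(clean, noisy))
    return 10 * math.log10(signal_power / noise_power)


rng = random.Random(0)  # fixed seed so the run is repeatable

# One second of a "speech" tone that reaches both microphones identically.
speech = [math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE_HZ)
          for n in range(SAMPLE_RATE_HZ)]

# Each microphone adds its own independent Gaussian noise.
mic1 = [s + rng.gauss(0.0, 1.0) for s in speech]
mic2 = [s + rng.gauss(0.0, 1.0) for s in speech]

# Averaging keeps the common speech and partially cancels the noise.
averaged = [(a + b) / 2 for a, b in zip(mic1, mic2)]

gain_db = snr_db(speech, averaged) - snr_db(speech, mic1)
print(f"SNR improvement from averaging two microphones: {gain_db:.1f} dB")
```

Under these assumptions the improvement is roughly 3 dB, a modest gain. Real environments also break the assumptions (noise is often correlated between microphones, and speech rarely arrives at both simultaneously), which suggests why practical solutions rely on adaptive processing rather than plain averaging.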

This adaptive noise suppression algorithm provides 25 dB of noise suppression and separates the desired speech from ambient noise in real time. It handles speech from a variety of sources and positions while preserving natural sound quality (speech processed by other solutions can sound unnatural and thin), and it works effectively with microphones of varying quality.
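To put the 25 dB figure in linear terms, the short Python sketch below applies the standard decibel definitions. The function names are illustrative, and the sketch assumes the figure describes a reduction in noise level relative to the unprocessed signal.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)


def db_to_power_ratio(db: float) -> float:
    """Convert a level difference in dB to a linear power ratio."""
    return 10 ** (db / 10)


SUPPRESSION_DB = 25.0  # figure quoted in the article

print(f"Amplitude reduced by a factor of {db_to_amplitude_ratio(SUPPRESSION_DB):.1f}")
print(f"Power reduced by a factor of {db_to_power_ratio(SUPPRESSION_DB):.0f}")
```

In other words, 25 dB of suppression corresponds to noise amplitude reduced by a factor of about 17.8, or noise power reduced by a factor of about 316.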

BelaSigna R261 Key Features Analysis

BelaSigna R261 is a high-performance voice capture SoC that integrates a digital signal processor (DSP), a voltage regulator, a phase-locked loop (PLL), a level converter and ROM. This high level of integration reduces the bill of materials (BOM) compared with other solutions. As shown in Figure 2, the device accepts direct input from two microphones, holds the noise reduction algorithm in its integrated ROM, and uses a DSP-based application controller that combines high performance with ultra-low energy consumption. It provides dual-channel analog output and also supports digital microphone output. In addition, the built-in power management module supports supply voltages from 1.8 V to 3.3 V, the on-chip PLL offers a variety of frequency options, and an I2C interface is provided.

Figure 2: BelaSigna R261 high-performance voice capture SoC functional architecture

Notably, the dual-microphone real-time adaptive noise reduction algorithm in BelaSigna R261 provides two basic algorithm modes: far-talk pickup (algorithm mode 0) and close-talk pickup (algorithm mode 1). Algorithm mode 0 is optimized for long-distance pickup and can capture voices up to 6 meters away while suppressing noise. It also supports 360-degree omnidirectional pickup, which suits laptops, speakerphones and conference phones, and the hands-free mode of mobile phones. In this mode, the device delivers excellent voice clarity even when the user is not facing the microphone or is far away from it, giving the user more freedom of movement. Algorithm mode 1 is optimized for close-distance pickup, where the user is very close to the microphone (less than 5 cm), and it effectively suppresses a wide range of environmental noise. It suits mobile phones, learning machines, walkie-talkies and other devices that operate in very noisy environments.

In addition to these two basic algorithm modes, BelaSigna R261 provides a custom algorithm mode to help manufacturers meet application-specific requirements. This mode supports special configurations and can be tuned by loading new algorithm parameters from an external EEPROM or over the I2C control interface, so the algorithm can be optimized for the specific application, microphone type, microphone placement and other system parameters.
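The distance guidance for the two basic modes can be captured in a small helper. The Python sketch below is purely illustrative: the enum, the function name and the thresholds-as-constants are hypothetical and do not correspond to any register or API of the device. It also assumes that distances between 5 cm and 6 m fall under mode 0, which the text implies but does not state outright.

```python
from enum import IntEnum


class AlgorithmMode(IntEnum):
    """Basic algorithm modes as numbered in the article (names are hypothetical)."""
    FAR_TALK = 0    # long-distance pickup, up to 6 m
    CLOSE_TALK = 1  # close-distance pickup, under 5 cm


CLOSE_TALK_LIMIT_M = 0.05  # "less than 5 cm" from the article
FAR_TALK_LIMIT_M = 6.0     # "up to 6 meters" from the article


def suggest_mode(distance_m: float) -> AlgorithmMode:
    """Suggest a basic mode for a given talker-to-microphone distance."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    if distance_m < CLOSE_TALK_LIMIT_M:
        return AlgorithmMode.CLOSE_TALK
    if distance_m <= FAR_TALK_LIMIT_M:
        # Assumption: everything from 5 cm up to 6 m is treated as far-talk.
        return AlgorithmMode.FAR_TALK
    raise ValueError("distance is beyond the range described in the article")


print(suggest_mode(0.03).name)  # handset held at the mouth
print(suggest_mode(1.5).name)   # laptop or speakerphone on a desk
```

A product that does not fit either case would fall under the custom mode, whose parameters are not described in the article and are therefore not modeled here.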

Table 2: BelaSigna R261 supports different modes such as long-distance pickup, close-distance pickup and customization

As mentioned above, BelaSigna R261 is highly integrated, has a built-in adaptive noise reduction algorithm, and can connect directly to the digital microphone interface or the microphone input of the main chip (baseband processor). Besides supporting multiple pickup modes, then, another important advantage of the device is that it is easy to design in. The design team does not need to develop or license algorithms, nor design complex support and interface circuits, which minimizes the time and engineering effort required.

The device also lets cost-conscious OEMs use two inexpensive omnidirectional microphones, which need not be matched, in their designs. This makes microphone placement more flexible and eliminates the need to tune microphones on the production line, saving further time and cost. The SoC comes in an extremely compact 5.3 mm² WLCSP package (available in 26- and 30-ball versions), which takes up much less board space than alternative solutions and fits even the most space-constrained portable consumer electronics form factors. In addition, the device draws very little power, with a current consumption of 15 mA at 3.3 V.
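The current and voltage figures translate into power as follows. This Python sketch simply multiplies the two quoted values; the 10-minute call duration and the function names are illustrative assumptions, and regulator losses or other supply conditions are ignored.

```python
SUPPLY_VOLTAGE_V = 3.3    # from the article
SUPPLY_CURRENT_A = 0.015  # 15 mA, from the article


def power_mw(voltage_v: float, current_a: float) -> float:
    """Return electrical power in milliwatts (P = V * I)."""
    return voltage_v * current_a * 1000.0


def energy_joules(voltage_v: float, current_a: float, seconds: float) -> float:
    """Return energy in joules consumed over the given duration."""
    return voltage_v * current_a * seconds


CALL_SECONDS = 10 * 60  # hypothetical 10-minute call

print(f"Power: {power_mw(SUPPLY_VOLTAGE_V, SUPPLY_CURRENT_A):.1f} mW")
print(f"Energy per call: "
      f"{energy_joules(SUPPLY_VOLTAGE_V, SUPPLY_CURRENT_A, CALL_SECONDS):.1f} J")
```

At the quoted operating point this works out to about 49.5 mW, or roughly 29.7 J over the assumed 10-minute call.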

BelaSigna R261 Application Design Points

Because the BelaSigna R261's ROM-based noise reduction algorithm is very flexible, there are many possible microphone layouts (physical acoustic designs). However, the default algorithm works optimally only when the microphones are arranged as follows: 1) both microphones face the user's mouth; 2) each microphone is 10 to 25 mm from the midpoint between the two. Other microphone layouts can be used in custom mode.

In terms of circuit design, BelaSigna R261 is designed to support both digital and analog processing in a single system. Due to the nature of this mixed-signal circuit, careful design of printed circuit board (PCB) layout is critical to maintaining high audio fidelity. To avoid coupling noise into the audio signal path, keep digital signal traces away from analog signal traces. To avoid electrical feedback coupling, the input traces also need to be isolated from the output traces.

In terms of ground design, the ground plane should be divided into two parts: the analog ground plane (VSSA) and the digital ground plane (VSSD). The two planes should be joined at a single point, the star connection point, which should be located at the ground terminal of the capacitor at the output of the power regulator. These are only some of the issues designers need to consider when designing with the BelaSigna R261. Detailed design guidelines can be found in Reference 2.

Summary

Designers of portable device audio systems need high-performance voice capture solutions that are easy to integrate while meeting their requirements for size, power consumption and cost. As a leading supplier of high-performance silicon solutions for energy-efficient electronics, ON Semiconductor offers designers a simple choice in the BelaSigna R261 high-performance voice capture SoC. The device is highly integrated, has a built-in advanced adaptive noise reduction algorithm, and supports multiple voice pickup modes, enabling smartphones, walkie-talkies, notebooks, tablets and similar products to deliver clear and comfortable voice communication. It offers great design flexibility, a small footprint and low power consumption, and it works with low-cost microphones. Together these allow manufacturers of all kinds of portable consumer electronics to make speech markedly easier to recognize, raise customer satisfaction and shorten time to market.
