Of the four modules of the voice control platform, the automatic speech recognition module converts audio signals into text. Its function is relatively simple and was introduced in the previous article, so it is not repeated here. The semantic understanding, intent decision, and skill distribution/decision modules are more complex and form the core capabilities of voice control. The sub-functions of each module are shown in Figure 4.

The semantic understanding module covers query analysis, scene classification, intent recognition, context recognition, template intervention, and slot extraction. Slot extraction pulls the keywords out of an utterance; the intent is then classified by scene and adjusted using the context, so that the true intent of the sentence can be determined accurately. Slot extraction also brings several benefits. When new businesses are added, the platform no longer depends on the language understanding ability of third-party skills, so it can dock flexibly with third-party businesses. Slots can be trained to match business needs, which makes new businesses easier to develop. Once slots are subdivided by scene, they can be customized for specific user groups and usage scenarios, improving service accuracy and the operational conversion rate.

The intent decision module covers multi-intent decision, contextual decision, personalized intervention, and user portrait generation. It adjusts the intent according to the user's usage habits and context, and selects from several candidate intents the one that best matches the user's scenario, which improves intent accuracy.
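The flow from slot extraction to intent decision can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the template patterns, the scene and intent names, the `UserProfile` fields, and the scoring weights are hypothetical, and a real platform would use trained models rather than regular expressions.

```python
import re
from dataclasses import dataclass, field

# Hypothetical templates: (scene, intent, pattern with named slot groups).
# The same wording can match more than one scene, producing several candidates.
TEMPLATES = [
    ("music", "play_song", re.compile(r"^play (?P<title>.+)$")),
    ("video", "play_movie", re.compile(r"^play (?P<title>.+)$")),
    ("weather", "query_weather", re.compile(r"^weather in (?P<city>.+)$")),
]


@dataclass
class Candidate:
    scene: str
    intent: str
    slots: dict
    score: float = 1.0


@dataclass
class UserProfile:
    # Hypothetical user portrait: how often the user has used each scene.
    scene_usage: dict = field(default_factory=dict)


def understand(query: str) -> list:
    """Semantic understanding: match templates and extract slots."""
    text = query.strip().lower()
    candidates = []
    for scene, intent, pattern in TEMPLATES:
        match = pattern.match(text)
        if match:
            candidates.append(Candidate(scene, intent, match.groupdict()))
    return candidates


def decide(candidates, profile, last_scene=None):
    """Intent decision: re-rank candidates by user habits and context."""
    if not candidates:
        return None
    total = sum(profile.scene_usage.values()) or 1
    for c in candidates:
        # Habit bonus: share of past usage that fell in this scene.
        c.score += profile.scene_usage.get(c.scene, 0) / total
        # Context bonus (arbitrary weight): same scene as the previous turn.
        if c.scene == last_scene:
            c.score += 0.5
    return max(candidates, key=lambda c: c.score)
```

With these assumptions, "play yesterday" yields two candidates. A user whose history is mostly video is routed to `play_movie`, while a user with no history who was just in the music scene is routed to `play_song`.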
The skill distribution/decision module chooses the final decision result using data models or manual intervention. This controls how intents are distributed and allows flexible docking with third-party content resources.
Figure 4 Voice control core module
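As a rough illustration of skill distribution, the sketch below picks a skill for an intent either by a manually configured override or by a model score. The `Skill` fields, the provider names, and the score values are hypothetical; the article does not specify how the data model or the manual intervention is implemented.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    intents: frozenset   # intents this skill is able to serve
    model_score: float   # hypothetical quality score produced by a data model


def distribute(intent, skills, manual_overrides=None):
    """Pick the skill that should serve `intent`.

    A manual override (operator configuration) wins if the named skill can
    serve the intent; otherwise the highest-scoring capable skill is chosen.
    Returns None when no skill can serve the intent.
    """
    overrides = manual_overrides or {}
    capable = [s for s in skills if intent in s.intents]
    forced = overrides.get(intent)
    for s in capable:
        if s.name == forced:
            return s
    return max(capable, key=lambda s: s.model_score, default=None)


# Usage: two hypothetical third-party content providers.
skills = [
    Skill("provider_a", frozenset({"play_song"}), 0.7),
    Skill("provider_b", frozenset({"play_song", "play_movie"}), 0.9),
]
by_model = distribute("play_song", skills)
by_operator = distribute("play_song", skills, {"play_song": "provider_a"})
```

Because the override table lives outside the skills themselves, an operator could redirect an intent to a different provider by changing configuration only.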
4 Voice Central Control Platform Software Architecture
The software of the voice control platform is divided into three layers: the underlying technology layer, the core capability layer, and the docking layer, which requires secondary development. The hierarchy is shown in Figure 5.

The underlying technology layer comprises deep learning algorithms, speech recognition technology, natural language processing, and basic data models. These are the foundational technologies of intelligent speech. They are highly specialized and generally need no special customization, so mature third-party solutions can be used.

The core capability layer comprises the scene classification, intent recognition, slot extraction, context judgment, decision and skill distribution, user portrait, and personalized recommendation modules, covering all the core functions of voice cloud processing. Performance optimization of speech processing and custom development of differentiated functions are implemented in this layer.

The service docking, model training, decision configuration, and data analysis modules sit above the core capability layer. They dock specific businesses and services and require secondary development according to business needs. This layer must dock multiple services flexibly, analyze business data and train models, and define decision mechanisms suited to each business type and user scenario, so that complex or ambiguous statements are matched to the right function.
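One way to picture the layer boundaries is as interfaces, sketched below under stated assumptions. The class and method names (`SpeechRecognizer`, `BusinessService`, `CoreCapability`, `EchoRecognizer`, `MusicService`) are hypothetical, and the understanding step is a placeholder. The point is only that the core layer depends on the other two through narrow interfaces, so a third-party engine or a new business adapter can be swapped in without changing it.

```python
from typing import Protocol


class SpeechRecognizer(Protocol):
    """Underlying technology layer: may wrap a mature third-party engine."""
    def transcribe(self, audio: bytes) -> str: ...


class BusinessService(Protocol):
    """Docking layer: one adapter per business or third-party service."""
    def handles(self, intent: str) -> bool: ...
    def execute(self, intent: str, slots: dict) -> str: ...


class CoreCapability:
    """Core capability layer: understanding, decision, and distribution."""

    def __init__(self, recognizer: SpeechRecognizer, services: list):
        self.recognizer = recognizer
        self.services = services

    def _understand(self, text: str):
        # Placeholder for scene classification, intent recognition, slots.
        if text.startswith("play "):
            return "play_song", {"title": text[len("play "):]}
        return "unknown", {}

    def handle(self, audio: bytes) -> str:
        text = self.recognizer.transcribe(audio)
        intent, slots = self._understand(text)
        for service in self.services:
            if service.handles(intent):
                return service.execute(intent, slots)
        return "unsupported request"


class EchoRecognizer:
    """Stand-in for an ASR engine: treats the audio bytes as UTF-8 text."""
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8").strip().lower()


class MusicService:
    """Hypothetical business adapter written during secondary development."""
    def handles(self, intent: str) -> bool:
        return intent == "play_song"

    def execute(self, intent: str, slots: dict) -> str:
        return f"music: playing {slots['title']}"


platform = CoreCapability(EchoRecognizer(), [MusicService()])
```

Adding a business in this sketch means writing another adapter with `handles` and `execute` and appending it to the service list; the core class is untouched.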
5 Conclusion
This article presents a solution for enterprises to build a private voice control platform. The voice control platform occupies a pivotal position in the overall voice link. With a private platform, third-party services and skills can be configured flexibly from the cloud without disturbing users, which speeds up the optimization and iteration of intelligent voice features. Voice skills can also be customized for specific businesses and usage scenarios to give users distinctive voice services. In addition, a private voice control platform makes user data easier to manage and helps keep voice data secure. Therefore, whether viewed from resource integration, performance improvement, or business expansion, building a private voice control platform is the future trend for large enterprises.
References:
[1] Guo Jingjing. The significance of speech recognition technology development in promoting Mandarin[J]. Communication Research, 2020(18).
[2] Du Lingjun, Wu Xiaodao. Global patent layout trend of speech recognition technology[J]. Science and Technology China, 2021(12).
[3] Zhang Dalin, Ren Xuan, Xu Yimin, et al. Design and implementation of speech recognition technology for enterprise intranet system[J]. Digital Technology and Application, 2021(12).
[4] Yuan Bingqing, Yu Gan, Zhou Xia. A brief introduction to speech recognition technology[J]. Digital Communication World, 2020(02).
[5] Zhang Yu, Gao Lingyan, Hu Huan, et al. Research on the application of intelligent speech recognition technology in postal express lockers[J]. Electronic World, 2020(04).
[6] Li Boli. Mathematics in traditional computer speech recognition technology[J]. Fireworks Technology and Market, 2020(02).
[7] Hao Ouya, Wu Xuan, Liu Rongkai. Development status and application prospects of intelligent speech recognition technology[J]. Electroacoustic Technology, 2020(03).
[8] Peng Hongsong, Li Hongbin, Li Li, et al. Research on far-field speech recognition technology in artificial intelligence[J]. Digital Communication World, 2020(05).
[9] Yu Xiaoming. Development and application of speech recognition technology[J]. Computer Age, 2019(11).
[10] Tian Jianyong, Liu Song, Li Zhouyue, et al. Design analysis of intelligent voice reminder system[J]. Computer Knowledge and Technology, 2020(20).
[11] Li Yaming, Li Yang. Research on the application of artificial intelligence in the television industry in the era of smart media[J]. Publishing Wide Angle, 2019(03).
[12] Zhan Hongyan. Practice of artificial intelligence in television human-computer interaction[J]. Digital Technology and Application, 2019(03).
[13] Zhang Lanshan, Huang Gaoyuan. Opportunities and challenges brought by artificial intelligence technology to television media[J]. China Television, 2018(07).
[14] Hou Guangmin. Application of artificial intelligence in television human-computer interaction[J]. Cable TV Technology, 2017(11).