Published on 2018-8-27 22:19

Introduction to Speech Recognition Technology

Beginners should start with the general steps of speech recognition and the mainstream methods in use today.

Speech recognition mainly includes the following steps:
1) Preprocessing. The input speech signal is pre-emphasized, split into frames, and windowed to suppress unimportant information and background noise, and endpoint detection is performed to determine the valid speech segment.
2) Feature extraction. Common time-domain feature parameters include amplitude, zero-crossing rate, and energy. Common frequency-domain feature parameters include linear prediction cepstral coefficients (LPCC) and Mel-frequency cepstral coefficients (MFCC).
3) Pattern matching. The extracted features are compared against stored templates or trained models to decide what was spoken.

There are several mainstream speech recognition technologies:
1) Dynamic Time Warping (DTW). DTW uses dynamic programming to stretch or compress one feature sequence in time so that it lines up with another, and then measures the distance between the aligned feature vectors. It is a classic algorithm in speech recognition. DTW is relatively easy to implement, but it cannot fully exploit the temporal and dynamic characteristics of the speech signal. It is therefore suited to relatively simple systems, such as isolated-word, small-vocabulary Chinese speech recognition.
2) Hidden Markov Model (HMM). An HMM uses the states of a Markov chain to represent the pronunciation process. While a word is being produced, the system moves from one state to another and generates an output in each state until the whole word has been output. The Markov chain models how the signal changes over time, but that change is described only indirectly, through the sequence of outputs. An HMM is therefore a doubly stochastic process, and it describes both the overall non-stationarity and the short-term stationarity of the speech signal well. Its drawbacks: it requires prior assumptions about the distribution of the state sequence; its ability to model higher-level acoustic and phonetic structure is weak, so acoustically similar words are easily confused; and an HMM-based speech recognition system is difficult to implement in hardware.
3) Artificial neural network (ANN). Training takes a long time.

Difficulties of existing speech recognition:
1) Recognition performance depends on the surrounding environment. When the training environment and the test environment differ, accuracy deteriorates.
2) Noise. How to remove noise from the signal.
3) Ambiguity of speech information. How to distinguish words with similar pronunciations, and words with the same pronunciation but different meanings.
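
The preprocessing step and the time-domain features listed above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a reference implementation: the function names, the pre-emphasis coefficient 0.97, the 200-sample frame with an 80-sample hop, and the rule "a frame is speech if its energy exceeds 10% of the maximum frame energy" are all assumptions chosen for the example.

```python
import math


def pre_emphasize(signal, alpha=0.97):
    """Pre-emphasis y[n] = x[n] - alpha * x[n-1]; alpha=0.97 is an assumed value."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]


def hamming_window(length):
    """Hamming window coefficients for one frame (length must be at least 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def split_frames(signal, frame_len, hop):
    """Cut the signal into overlapping frames; an incomplete last frame is dropped."""
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def preprocess(signal, frame_len=200, hop=80):
    """Pre-emphasize, frame, and window the signal; returns a list of frames."""
    window = hamming_window(frame_len)
    frames = split_frames(pre_emphasize(signal), frame_len, hop)
    return [[x * w for x, w in zip(frame, window)] for frame in frames]


def detect_endpoints(frames, ratio=0.1):
    """Energy-based endpoint detection (assumed rule: energy > ratio * max energy).

    Returns (first_frame, last_frame) of the speech segment, or None.
    """
    if not frames:
        return None
    energies = [short_time_energy(f) for f in frames]
    threshold = ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e > threshold]
    return (active[0], active[-1]) if active else None


# Demo on a synthetic signal: silence, a 440 Hz tone, silence (8 kHz sampling).
rate = 8000
silence = [0.0] * 800
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(1600)]
frames = preprocess(silence + tone + silence)
print("speech frames:", detect_endpoints(frames))
print("ZCR inside the tone:", zero_crossing_rate(frames[15]))
```

Real systems usually combine energy and zero-crossing rate with two thresholds for endpoint detection; the single energy threshold here is only the simplest variant.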
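
To make the DTW method above concrete, here is a small sketch of isolated-word matching against stored templates. It assumes each utterance is already a sequence of feature vectors; the one-dimensional toy "features", the template names, and the helper `recognize` are made up for illustration. It uses the basic step pattern (match, insertion, deletion) with Euclidean local distance and no path constraints.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])  # Euclidean distance
            cost[i][j] = local + min(cost[i - 1][j],      # seq_b frame repeated
                                     cost[i][j - 1],      # seq_a frame repeated
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def recognize(query, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Hypothetical one-dimensional templates for two isolated words.
templates = {
    "yes": [[0.0], [1.0], [2.0], [1.0], [0.0]],
    "no": [[2.0], [2.0], [0.0], [0.0]],
}
# The query is a slower version of "yes": same shape, different duration.
query = [[0.0], [0.0], [1.0], [2.0], [2.0], [1.0], [0.0]]
print(recognize(query, templates))
```

Because the warping absorbs differences in speaking rate, the stretched query still matches its template with zero cost, which is why DTW works for small isolated-word vocabularies.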
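
For the HMM approach above, recognition of an isolated word usually means scoring the observation sequence against one model per word and picking the most likely one. The sketch below uses the forward algorithm on discrete observation symbols. The two-state left-to-right models and all their probabilities are invented for the example, and real systems work with continuous features and with log or scaled probabilities to avoid underflow.

```python
def forward_likelihood(obs, start_p, trans_p, emit_p):
    """P(obs | model) for a discrete HMM, computed with the forward algorithm.

    obs: list of symbol indices; start_p[s]: initial state probability;
    trans_p[r][s]: probability of moving from state r to s;
    emit_p[s][o]: probability that state s emits symbol o.
    """
    states = range(len(start_p))
    # alpha[s] = probability of the observations so far, ending in state s
    alpha = [start_p[s] * emit_p[s][obs[0]] for s in states]
    for symbol in obs[1:]:
        alpha = [sum(alpha[r] * trans_p[r][s] for r in states) * emit_p[s][symbol]
                 for s in states]
    return sum(alpha)


# Hypothetical left-to-right word models sharing one transition structure.
start = [1.0, 0.0]
trans = [[0.6, 0.4],
         [0.0, 1.0]]
word_models = {
    "word_a": [[0.9, 0.1], [0.2, 0.8]],  # emits mostly symbol 0, then symbol 1
    "word_b": [[0.1, 0.9], [0.8, 0.2]],  # emits mostly symbol 1, then symbol 0
}

observation = [0, 0, 1, 1]
scores = {word: forward_likelihood(observation, start, trans, emit)
          for word, emit in word_models.items()}
print(max(scores, key=scores.get), scores)
```

The hidden state path is never observed; only the emitted symbols are, which is the "doubly stochastic" property described above.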

This post is from Analogue and Mixed Signal
