Fully voice-interactive smart home control system based on the CC3200 wireless Wi-Fi processor

Jacktang

Table of contents

Chapter 1 System's Main Functions and Module Description

1.1 Main functions of the system

1.2 Module Description

Chapter 2 Main Chips

2.1 Main processor chip

2.2 Auxiliary chip

2.2.1 LD3320 voice chip

2.2.2 CC3200 Wireless WiFi Processor

Chapter 3 System Chip Pin Connection Diagram

Chapter 4 Key System Technologies and Software Design

4.1 Key technologies of the system

4.1.1 Speech Recognition Technology

4.1.2 Endpoint Detection VAD (Voice Activity Detection)

4.1.3 Non-specific speech recognition

4.2 Software Design

Chapter 5 System Effect Display

Chapter 1 System's Main Functions and Module Description

1.1 Main functions of the system

As society progresses and science and technology develop rapidly, people have begun to pursue a more intelligent and comfortable home environment, and smart homes have grown quickly as a result. Smart home control systems are an important part of the smart home, but they still have shortcomings: a single form of control, a poor user experience, and a lack of personalization and freedom. For this reason, we designed a home system that is controlled entirely by voice.

1.2 Module Description

During the design process, this work overcame the following major difficulties:

1 : Significantly improving the speech recognition rate through pre-synchronization and garbage ("spam") keywords.

2 : Semantic understanding and synthesis.

3 : Filtering algorithm design.

Solving these difficulties also led to the following major innovations:

1 : Full voice control.

2 : Non-specific (speaker-independent) speech recognition. The keywords to be recognized are described with phonetic symbols, so the system can recognize simple foreign-language words and pure dialect pronunciations, which makes it more intelligent.

3 : Custom voice commands can be edited dynamically. Users can edit the command phrases to be recognized as they wish, which makes the system more user-friendly.

4 : Strong scalability. This work uses the CC3200 wireless microcontroller, which paves the way for further functional expansion such as wireless control and connection to cloud platforms.

Chapter 2 Main Chips

2.1 Main processor chip

The main processor is an STM32F103ZET6.

2.2 Auxiliary chip

2.2.1 LD3320 voice chip

The LD3320 is a dedicated "speech recognition" chip designed and produced by ICRoute. It integrates a speech recognition processor and several peripheral circuits, including AD and DA converters, a microphone interface, and a sound output interface. The design focuses on energy saving and efficiency. It needs no external auxiliary chips such as Flash or RAM, and it can be integrated directly into existing products to provide speech recognition, voice control, and human-computer dialogue functions. In addition, the list of keywords to be recognized can be edited dynamically.

2.2.2 CC3200 Wireless WiFi Processor

The SimpleLink CC3200 device for Internet of Things (IoT) applications is a wireless MCU that integrates a high-performance ARM Cortex-M4 MCU, enabling customers to develop an entire application with a single integrated circuit (IC). With on-chip Wi-Fi, Internet, and robust security protocols, faster development is possible without the need for previous Wi-Fi experience.

Here we mainly use it for future wireless control expansion.

Chapter 3 System Chip Pin Connection Diagram

Chapter 4 Key System Technologies and Software Design

4.1 Key technologies of the system

4.1.1 Speech Recognition Technology

The system uses recognition based on a "keyword list", an ASR (automatic speech recognition) technology. The sound input through the microphone undergoes spectrum analysis, voice features are extracted from it, these features are compared and matched against the keywords in the keyword list, and the keyword with the highest score is output as the recognition result.
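The last two steps of that pipeline (match against every keyword, output the highest score) can be sketched in a few lines of Python. This is only an illustration of the idea, not the LD3320's internal algorithm, which runs inside the chip. The feature vectors, the `TEMPLATES` table, and the use of cosine similarity as the score are all assumptions made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recognize(features, keyword_templates):
    """Score the input against every keyword and return the best match.

    features: feature vector extracted from the input sound (hypothetical).
    keyword_templates: dict mapping keyword -> reference feature vector.
    Returns (best_keyword, scores).
    """
    scores = {kw: cosine(features, tpl) for kw, tpl in keyword_templates.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical keyword list with made-up 3-dimensional templates.
TEMPLATES = {
    "kai deng": [1.0, 0.0, 0.0],         # "turn on the light"
    "guan deng": [0.0, 1.0, 0.0],        # "turn off the light"
    "kai chuang lian": [0.0, 0.0, 1.0],  # "open the curtains"
}

best, scores = recognize([0.9, 0.1, 0.0], TEMPLATES)
print(best)
```

Note that a matcher of this kind always returns some keyword, even for speech that is not in the list, which is why list-based recognizers benefit from extra entries that absorb unrelated input.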

4.1.2 Endpoint Detection VAD ( Voice Activity Detection )

VAD (Voice Activity Detection) determines the points in a voice data stream at which human speech begins and ends. Speech is judged to begin when voiced sound appears above the background sound. Speech is judged to have ended once background sound alone has been detected for a certain duration (for example, 600 milliseconds). After VAD has determined the region in which someone is speaking, the speech recognition chip processes the sound data from that period and computes the recognition result.
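The start and end rule described above can be sketched as a simple energy-threshold detector. This is a minimal illustration under stated assumptions, not the LD3320's actual VAD: the per-frame energy values, the 10 ms frame length, the fixed threshold, and the function name `detect_speech` are all hypothetical. Only the 600 ms silence duration comes from the text.

```python
def detect_speech(energies, threshold, frame_ms=10, hangover_ms=600):
    """Find the first speech region in a sequence of per-frame energies.

    Speech starts at the first frame whose energy exceeds `threshold`
    (i.e. rises above the background). It ends once `hangover_ms` of
    consecutive background-only frames have been seen.
    Returns (start_frame, end_frame) with end_frame exclusive, or None
    if no frame exceeds the threshold.
    """
    hangover_frames = hangover_ms // frame_ms
    start = None
    last_voiced = None
    quiet = 0
    for i, energy in enumerate(energies):
        if energy > threshold:
            if start is None:
                start = i          # first voiced frame: speech begins
            last_voiced = i
            quiet = 0              # any voiced frame resets the silence run
        elif start is not None:
            quiet += 1
            if quiet >= hangover_frames:
                return start, last_voiced + 1
    if start is None:
        return None
    # Stream ended before the full silence period was observed.
    return start, last_voiced + 1

# A 300 ms pause inside the utterance is shorter than 600 ms,
# so it does not end the speech region.
frames = [0] * 5 + [10] * 10 + [0] * 30 + [10] * 10 + [0] * 60
print(detect_speech(frames, threshold=5))
```

Waiting for a full silence period before closing the region keeps short pauses between words from splitting one command into two.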

4.1.3 Non-specific speech recognition

Speech recognition recognizes sounds, not written text. For speaker-independent ("non-specific person") speech recognition, the keywords to be recognized are described by their pronunciation, using phonetic symbols.
For the Chinese recognition currently supported by the LD3320, keywords are described in pinyin. In other words, any pronunciation that can be spelled out in pinyin can be entered into the chip and recognized.
Therefore, when simple foreign-language words or pure dialect pronunciations need to be recognized, this can be achieved by writing them in pinyin.
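A host-side keyword table built on pinyin strings might look like the sketch below. It is an illustration only: the `KeywordList` class, its methods, the action names, and the example phrases are hypothetical, and the step that actually writes the list into the LD3320 is omitted. Entries with no action are one possible reading of the garbage keywords mentioned in Chapter 1, namely filler phrases that absorb unrelated speech instead of triggering a command.

```python
class KeywordList:
    """Editable table mapping pinyin keywords to actions (hypothetical)."""

    def __init__(self):
        self._entries = {}  # normalized pinyin -> action name, or None for a filler

    @staticmethod
    def _normalize(pinyin):
        """Lower-case the phrase and collapse whitespace between syllables."""
        syllables = pinyin.lower().split()
        if not syllables or not all(s.isascii() and s.isalpha() for s in syllables):
            raise ValueError("keyword must be pinyin syllables made of ASCII letters")
        return " ".join(syllables)

    def add(self, pinyin, action=None):
        """Add or replace a keyword. action=None marks a filler entry."""
        self._entries[self._normalize(pinyin)] = action

    def remove(self, pinyin):
        """Delete a keyword; silently ignore one that is not present."""
        self._entries.pop(self._normalize(pinyin), None)

    def lookup(self, recognized):
        """Return the action for a recognized keyword, or None."""
        return self._entries.get(self._normalize(recognized))

    def keywords(self):
        """Return the pinyin strings that would be sent to the recognizer."""
        return sorted(self._entries)


commands = KeywordList()
commands.add("kai deng", "light_on")     # "turn on the light"
commands.add("guan deng", "light_off")   # "turn off the light"
commands.add("kai men")                  # filler: recognized but triggers nothing
commands.add("hai luo", "greet")         # a foreign word ("hello") approximated in pinyin
print(commands.keywords())
```

Because the table is ordinary data, a user-defined command is added by calling `add` again and resending the list, which matches the dynamic editing described earlier.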

4.2 Software Design

Chapter 5 System Effect Display