[Ai-Thinker (Anxinke) ESP32 Voice Development Board Special Topic ①] ESP32-A1S Audio Development Board: Offline Voice Recognition Control of an LED Light

 

Reprint: written and shared by the Ai-Thinker (Anxinke) open source team as a first technical look at ESP32-A1S offline voice.

1. Introduction
2. Offline speech framework
2.1 Algorithm model WakeNet and recognition model MultiNet
2.2 Wake-up word recognition
2.3 Voice command recognition
3. Development board, compilation, experience
3.1 ESP32-A1S development board
3.2 Pull code and specify idf path
3.3 Compile and control the LED light on the development board


1. Introduction

Offline voice means what the name suggests: without a network connection, the product can still recognize voice commands and drive the corresponding control outputs.

The intelligent voice assistant on Ai-Thinker's ESP32-A1S development board is based on Espressif's ESP32 chip and supports a wake-up word engine (WakeNet), an offline voice command recognition engine (MultiNet), and front-end acoustic algorithms. The ESP32-A1S combines the ESP32 with artificial intelligence (AI) voice recognition as part of a complete AIoT solution.

Below, working from my development notes, I explain how to use the ESP32-A1S development board to control LED lights by offline voice.



2. Offline Speech Framework
2.1 Algorithm Model WakeNet and Recognition Model MultiNet

Voice wake-up and local recognition both depend on an algorithm model and a recognition model. The ESP32-A1S builds on Espressif's esp_sr repository, which provides the algorithm models for voice recognition. It currently consists of three main modules:

Wake-up word recognition model (WakeNet)

Voice command recognition model (MultiNet)

Acoustic algorithms: acoustic echo cancellation (AEC), automatic gain control (AGC), noise suppression (NS), voice activity detection (VAD), and microphone array processing (Mic Array Processing).
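
To make one of these front-end stages more concrete, here is a minimal sketch of an energy-threshold voice activity detector. It is not taken from esp_sr; the function name frame_has_voice, the 16-bit PCM frame layout, and the threshold are all assumptions chosen purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an esp_sr API): returns true when the mean
 * squared amplitude of a frame of 16-bit PCM samples exceeds a
 * caller-chosen threshold. */
bool frame_has_voice(const int16_t *samples, size_t count, uint32_t threshold)
{
    if (samples == NULL || count == 0) {
        return false;
    }

    uint64_t energy = 0;
    for (size_t i = 0; i < count; i++) {
        int32_t s = samples[i];
        energy += (uint64_t)(s * s);  /* s * s fits in 32 bits for int16 input */
    }
    return (energy / count) > threshold;
}
```

A real front end would track the noise floor and smooth its decision over several frames; a fixed threshold like this one is only suitable for showing the idea.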

2.2 Wake-up word recognition

The wake-up word model WakeNet aims to provide a high-performance model with low resource consumption, supporting the recognition of wake-up words such as "Alexa", "Tmall Genie", and "Xiao Ai".

Currently, the wake-up words available for ESP32 are limited to a fixed set, including "Hi, Lexin", "Hello Xiaozhi", "Hello Xiaoxin", and "Hi, Jeson".
2.3 Voice Command Recognition

The command word recognition model MultiNet is dedicated to providing a flexible offline voice command recognition framework. Users can easily customize voice commands according to their needs without retraining the model.

Currently, the model supports the recognition of Chinese command words such as "turn on the air conditioner" and "turn on the bedroom light". Up to 100 custom voice commands can be defined.

English command word definitions will be supported in the next version.
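
Customizing commands without retraining can be pictured as editing a table of pinyin phrases. The sketch below assumes that picture; command_entry_t, command_table_t, command_table_add, and command_table_lookup are hypothetical names, and it does not show how the real SDK is configured. Only the limit of 100 comes from the text above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_COMMANDS 100  /* limit quoted in the text above */

/* Hypothetical table entry: a command ID plus its pinyin phrase. */
typedef struct {
    int id;
    const char *pinyin;
} command_entry_t;

typedef struct {
    command_entry_t entries[MAX_COMMANDS];
    size_t count;
} command_table_t;

/* Adds a phrase; fails when the table is full or the phrase is empty. */
bool command_table_add(command_table_t *table, int id, const char *pinyin)
{
    if (table == NULL || pinyin == NULL || pinyin[0] == '\0') {
        return false;
    }
    if (table->count >= MAX_COMMANDS) {
        return false;
    }
    table->entries[table->count].id = id;
    table->entries[table->count].pinyin = pinyin;
    table->count++;
    return true;
}

/* Returns the ID for an exact phrase match, or -1 when none matches. */
int command_table_lookup(const command_table_t *table, const char *pinyin)
{
    if (table == NULL || pinyin == NULL) {
        return -1;
    }
    for (size_t i = 0; i < table->count; i++) {
        if (strcmp(table->entries[i].pinyin, pinyin) == 0) {
            return table->entries[i].id;
        }
    }
    return -1;
}
```

For example, adding ID 5 with the phrase "da kai ke ting de deng" and then looking that phrase up returns 5, while an unknown phrase returns -1.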
3. Development board, compilation, experience
3.1 ESP32-A1S development board

Because the board is built around the ESP32 chip, the firmware is developed on the ESP32 SDK. The repository includes a demo that supports simple control of the ESP32-A1S development board.

ESP32-A1S official website development technical documentation: click

3.2 Pull code and specify idf path

The steps below use a Linux development environment as an example, since compiling and developing there is faster and more convenient.

ESP32 Linux development environment setup tutorial link: click

Clone the code with git. The --recursive option pulls the entire SDK project, including the esp-sr related components hosted on GitHub. Because GitHub is hosted overseas, the download can take a long time, so be patient.

git clone --recursive https://github.com/Ai-Thinker-Open/Ai-Thinker-Open_ESP32-A1S_ASR_SDK.git

Set IDF_PATH to the esp-idf that the project depends on, located inside esp-skainet-AI, for example:

export IDF_PATH=~/esp32/esp-skainet-AI/esp-idf/

Add the toolchain required for compilation to your PATH in ~/.bashrc.

3.3 Compile and control the LED lights on the development board

Open examples/Smart_home_scene_AI in the project, run make menuconfig, and select ESP32-A1S as the development board.

Then start compiling!

sudo chmod 777 /dev/ttyUSB0
make flash monitor -j8

Compilation succeeded.

Then say "Hello Xiaozhi" to the development board. The red indicator light turns on to show that wake-up succeeded. Next, say "Turn on the lights in the living room" (da kai ke ting de deng), and the living room light turns on.
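
The wake-then-command flow described above can be pictured as a small two-state machine. This is only a sketch under stated assumptions: the type and function names are hypothetical, and the timeout event and the indicator turning off after a command are my own additions, not behavior confirmed by the SDK.

```c
#include <stdbool.h>

/* Hypothetical two-stage flow: wait for the wake word, then accept one command. */
typedef enum {
    STATE_WAIT_WAKE_WORD,
    STATE_WAIT_COMMAND
} listen_state_t;

typedef enum {
    EVENT_WAKE_WORD,  /* wake word detected */
    EVENT_COMMAND,    /* command phrase recognized */
    EVENT_TIMEOUT     /* assumed: no command heard in time */
} listen_event_t;

typedef struct {
    listen_state_t state;
    bool indicator_on;    /* stands in for the red wake-up indicator */
    int last_command_id;  /* -1 until a command has been accepted */
} listener_t;

void listener_init(listener_t *l)
{
    l->state = STATE_WAIT_WAKE_WORD;
    l->indicator_on = false;
    l->last_command_id = -1;
}

/* Feeds one event into the state machine.
 * Returns true only when a command was accepted. */
bool listener_handle(listener_t *l, listen_event_t event, int command_id)
{
    if (l->state == STATE_WAIT_WAKE_WORD) {
        if (event == EVENT_WAKE_WORD) {
            l->indicator_on = true;
            l->state = STATE_WAIT_COMMAND;
        }
        return false;  /* commands before the wake word are ignored */
    }

    /* STATE_WAIT_COMMAND */
    if (event == EVENT_COMMAND) {
        l->last_command_id = command_id;
        l->indicator_on = false;
        l->state = STATE_WAIT_WAKE_WORD;
        return true;
    }
    if (event == EVENT_TIMEOUT) {
        l->indicator_on = false;
        l->state = STATE_WAIT_WAKE_WORD;
    }
    return false;
}
```

With this model, a command event that arrives before any wake word is dropped, which matches the order of the two spoken phrases in the walkthrough.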

The serial log lists the commands that can be recognized locally, each written in pinyin:

I (321) MN: ---------------------SPEECH COMMANDS---------------------
I (328) MN: Command ID0, phrase 0: da kai yi hao deng
I (333) MN: Command ID1, phrase 1: da kai er hao deng
I (339) MN: Command ID2, phrase 2: da kai san hao deng
I (345) MN: Command ID3, phrase 3: da kai si hao deng
I (351) MN: Command ID4, phrase 4: da kai wu hao deng
I (356) MN: Command ID5, phrase 5: da kai ke ting de deng
I (363) MN: Command ID6, phrase 6: guan bi ke ting de deng
I (369) MN: Command ID7, phrase 7: da kai wo shi de deng
I (375) MN: Command ID8, phrase 8: guan bi wo shi de deng
I (381) MN: Command ID9, phrase 9: da kai chu fang de deng
I (387) MN: Command ID10, phrase 10: guan bi chu fang de deng
I (393) MN: Command ID11, phrase 11: da kai zou lang de deng
I (400) MN: Command ID12, phrase 12: guan bi zou lang de deng
I (406) MN: Command ID13, phrase 13: da kai ce suo de deng
I (412) MN: Command ID14, phrase 14: guan bi ce suo de deng
I (419) MN: Command ID15, phrase 15: da kai wei sheng jian de deng
I (425) MN: Command ID16, phrase 16: guan bi wei sheng jian de deng
I (432) MN: Command ID17, phrase 17: da kai quan bu de deng
I (439) MN: Command ID18, phrase 18: guan bi quan bu de deng
I (445) MN: Command ID19, phrase 19: quan bu da kai
I (450) MN: Command ID20, phrase 20: quan bu guan bi
I (457) MN: Command ID83, phrase 21: quan bu
I (461) MN: Command ID84, phrase 22: guan bi wu hao deng
I (467) MN: Command ID85, phrase 23: guan bi si hao deng
I (473) MN: Command ID86, phrase 24: guan bi san hao deng
I (479) MN: Command ID87, phrase 25: guan bi er hao deng
I (485) MN: Command ID88, phrase 26: guan bi yi hao deng
I (491) MN: Command ID89, phrase 27: guan bi tai deng
I (497) MN: Command ID90, phrase 28: da kai tai deng
I (502) MN: Command ID91, phrase 29: guan bi shu fang de deng
I (509) MN: Command ID92, phrase 30: da kai shu fang de deng
I (515) MN: Command ID93, phrase 31: guan bi
I (520) MN: Command ID94, phrase 32: da kai
I (525) MN: Command ID95, phrase 33: da kai shi hao deng
I (531) MN: Command ID96, phrase 34: da kai jiu hao deng
I (537) MN: Command ID97, phrase 35: da kai ba hao deng
I (543) MN: Command ID98, phrase 36: da kai qi hao deng
I (549) MN: Command ID99, phrase 37: da kai liu hao deng
---------------------------------------------------------
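
If you want to check which ID belongs to which phrase without reading the log by eye, a line of this shape can be parsed on a PC. The following is a sketch only; parse_command_line is a hypothetical helper, not part of the SDK, and it assumes the "Command ID<n>, phrase <n>: <pinyin>" layout shown in the log above.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extracts the command ID and pinyin phrase from one
 * log line such as "I (356) MN: Command ID5, phrase 5: da kai ke ting de deng".
 * Returns false for lines that do not contain a command entry. */
bool parse_command_line(const char *line, int *id, char *phrase, size_t phrase_size)
{
    if (line == NULL || id == NULL || phrase == NULL || phrase_size == 0) {
        return false;
    }

    const char *start = strstr(line, "Command ID");
    if (start == NULL) {
        return false;
    }

    int parsed_id = 0;
    int phrase_index = 0;
    int consumed = 0;
    /* %n records how many characters precede the phrase text. */
    if (sscanf(start, "Command ID%d, phrase %d: %n",
               &parsed_id, &phrase_index, &consumed) != 2 || consumed == 0) {
        return false;
    }

    /* Copy the phrase (truncating if needed) and drop any line ending. */
    snprintf(phrase, phrase_size, "%s", start + consumed);
    phrase[strcspn(phrase, "\r\n")] = '\0';

    *id = parsed_id;
    return true;
}
```

Note that the command ID and the phrase index differ for later entries in the log (for example ID83 is phrase 21); the handler code works with the command ID.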

The processing code is as follows:

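#include <stdbool.h>
#include <stdio.h>

#include "freertos/FreeRTOS.h"
#include "freertos/task.h"

/* open_light() and close_light() are defined elsewhere in the example.
 * Their argument (50000 to 50003) selects which light to switch. */

/* Runs the action for a recognized command ID.
 * Returns true if the ID was handled, false otherwise. */
bool speech_commands_action(int command_id)
{
    printf("Commands ID: %d.\n", command_id);
    switch (command_id)
    {
    case 5: /* turn on the living room light */
        printf("打开客厅的灯\n");
        open_light(50000);
        break;
    case 6: /* turn off the living room light */
        printf("关闭客厅的灯\n");
        close_light(50000);
        break;
    case 7: /* turn on the bedroom light */
        printf("打开卧室的灯\n");
        open_light(50001);
        break;
    case 8: /* turn off the bedroom light */
        printf("关闭卧室的灯\n");
        close_light(50001);
        break;
    case 9: /* turn on the kitchen light */
        printf("打开厨房的灯\n");
        open_light(50002);
        break;
    case 10: /* turn off the kitchen light */
        printf("关闭厨房的灯\n");
        close_light(50002);
        break;
    case 11: /* turn on the corridor light */
        printf("打开走廊的灯\n");
        open_light(50003);
        break;
    case 12: /* turn off the corridor light */
        printf("关闭走廊的灯\n");
        close_light(50003);
        break;

    case 19: /* turn everything on, pausing 50 ms between lights */
        printf("全部打开\n");
        open_light(50000);
        vTaskDelay(50 / portTICK_PERIOD_MS);
        open_light(50001);
        vTaskDelay(50 / portTICK_PERIOD_MS);
        open_light(50002);
        vTaskDelay(50 / portTICK_PERIOD_MS);
        open_light(50003);
        break;

    case 20: /* turn everything off, pausing 50 ms between lights */
        printf("全部关闭\n");
        close_light(50000);
        vTaskDelay(50 / portTICK_PERIOD_MS);
        close_light(50001);
        vTaskDelay(50 / portTICK_PERIOD_MS);
        close_light(50002);
        vTaskDelay(50 / portTICK_PERIOD_MS);
        close_light(50003);
        break;

    default: /* any other ID is not handled here */
        return false;
    }
    return true;
}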
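
The handler above repeats one pattern per command, so the same mapping could also be written as a lookup table. The sketch below is one possible restructuring, not the SDK's code: light_action_t and dispatch_light_command are hypothetical names, open_light and close_light are replaced by stubs that only record the call so the example runs on a PC, and the 50 ms delays between lights are left out because they need FreeRTOS.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stubs standing in for the example's open_light()/close_light().
 * They only record the most recent call. */
int last_light = -1;
bool last_on = false;
int call_count = 0;

void open_light(int light)  { last_light = light; last_on = true;  call_count++; }
void close_light(int light) { last_light = light; last_on = false; call_count++; }

/* Hypothetical table row: which light a command switches, and to what state. */
typedef struct {
    int command_id;
    int light;
    bool turn_on;
} light_action_t;

static const light_action_t ACTIONS[] = {
    { 5, 50000, true }, { 6, 50000, false },   /* living room */
    { 7, 50001, true }, { 8, 50001, false },   /* bedroom */
    { 9, 50002, true }, { 10, 50002, false },  /* kitchen */
    { 11, 50003, true }, { 12, 50003, false }, /* corridor */
};

static const int ALL_LIGHTS[] = { 50000, 50001, 50002, 50003 };

/* Returns true if the command ID was handled, false otherwise. */
bool dispatch_light_command(int command_id)
{
    /* IDs 19 and 20 switch every light on or off. */
    if (command_id == 19 || command_id == 20) {
        for (size_t i = 0; i < sizeof ALL_LIGHTS / sizeof ALL_LIGHTS[0]; i++) {
            if (command_id == 19) {
                open_light(ALL_LIGHTS[i]);
            } else {
                close_light(ALL_LIGHTS[i]);
            }
        }
        return true;
    }

    for (size_t i = 0; i < sizeof ACTIONS / sizeof ACTIONS[0]; i++) {
        if (ACTIONS[i].command_id == command_id) {
            if (ACTIONS[i].turn_on) {
                open_light(ACTIONS[i].light);
            } else {
                close_light(ACTIONS[i].light);
            }
            return true;
        }
    }
    return false;
}
```

With a table, supporting another light means adding rows rather than new case blocks; the trade-off is that per-command special handling, such as the delays in the original, has to be added back separately.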

That concludes this introduction to offline voice on the ESP32-A1S for now.

This post is from Domestic Chip Exchange