[DigiKey "Smart Manufacturing, Non-stop Happiness" Creative Competition] 7. TTS function implementation

顺竿爬 · Published on 2023-12-21 11:02


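The TTS function is built on the edge-tts library, which takes a crawler-style approach: it sends requests to Microsoft Edge's online speech service rather than synthesizing speech locally, so a network connection is required.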

First, install the library:

```

pip3 install edge-tts

```

To list all supported voices, use the following command:

```
edge-tts --list-voices
```
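The same list can also be fetched from Python and narrowed to one language. The following is a minimal sketch, assuming that `edge_tts.list_voices()` returns a list of dictionaries containing `ShortName` and `Locale` keys; `filter_voices`, `fetch_voices` and `main` are hypothetical helper names introduced here for illustration.

```python
import asyncio


def filter_voices(voices, locale_prefix):
    """Return the sorted ShortName of every voice whose Locale starts with locale_prefix."""
    return sorted(
        v["ShortName"] for v in voices if v["Locale"].startswith(locale_prefix)
    )


async def fetch_voices():
    # Imported lazily so filter_voices can be used without the library installed.
    import edge_tts

    # Assumption: list_voices() queries the online service, so it needs network access.
    return await edge_tts.list_voices()


def main():
    # Print only the Mainland Chinese voices, one short name per line.
    for name in filter_voices(asyncio.run(fetch_voices()), "zh-CN"):
        print(name)
```

Calling `main()` prints the short names that can be passed as the `actor` argument of the test code, for example `zh-CN-XiaoyiNeural`.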

The test code is as follows:

```
#!/usr/bin/env python3
import io

import edge_tts
import pydub


async def tts(text, actor="zh-CN-XiaoyiNeural", fmt="mp3"):
    # Look up the full voice name from its short name
    _voices = await edge_tts.VoicesManager.create()
    _voices = _voices.find(ShortName=actor)
    _communicate = edge_tts.Communicate(text, _voices[0]["Name"])
    _out = bytes()
    # Collect the audio chunks as they are streamed back
    async for _chunk in _communicate.stream():
        if _chunk["type"] == "audio":
            # print(_chunk["data"])
            _out += _chunk["data"]
        elif _chunk["type"] == "WordBoundary":
            # print(f"WordBoundary: {_chunk}")
            pass
    if fmt == "mp3":
        return _out
    if fmt == "wav":
        # Decode the MP3 data and resample to 16 kHz before exporting as WAV
        _raw = pydub.AudioSegment.from_file(io.BytesIO(_out))
        _raw = _raw.set_frame_rate(16000)
        _wav = io.BytesIO()
        _raw.export(_wav, format="wav")
        # for i in range(len(_wav.getvalue())-1,-1,-1):
        #     if _wav.getvalue()[i] != 0x00:
        #         break
        return _wav.getvalue()#[:i+1]
    # Any other format is unsupported; fail loudly instead of returning None
    raise ValueError(f"unsupported format: {fmt}")


if __name__ == "__main__":
    import asyncio
    import pydub.playback

    while True:
        text_in = input("> Say something: ")
        raw_wav = asyncio.run(tts(text_in, actor="zh-CN-XiaoyiNeural", fmt="wav"))
        wav = pydub.AudioSegment.from_file(io.BytesIO(raw_wav))
        pydub.playback._play_with_pyaudio(wav)
```
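To check that the WAV branch really produces 16 kHz audio before sending it to the sound card, the returned bytes can be inspected with the standard-library `wave` module. This is only a sketch: `describe_wav` and `make_silence` are hypothetical helpers, and `make_silence` exists purely to generate stand-in data so the check can be tried without a network connection.

```python
import io
import wave


def describe_wav(wav_bytes):
    """Return (channels, sample_rate_hz, duration_seconds) for in-memory WAV data."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        rate = w.getframerate()
        return w.getnchannels(), rate, w.getnframes() / rate


def make_silence(seconds, rate=16000):
    """Build a mono 16-bit WAV of silence, used only as stand-in test data."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


# Stand-in data; in practice pass the bytes returned by tts(..., fmt="wav").
channels, rate, duration = describe_wav(make_silence(0.5))
print(channels, rate, duration)
```

With real output from the test code, the second value should be 16000 if the resampling step was applied.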

Playback goes through pyaudio so that the output device can be specified, because the audio should play through the I2S HAT rather than the default device. First run aplay -l to find the device number, then modify the library source: add output_device_index=1 at line 26 of site-packages/pydub/playback.py. The complete function then reads:

```
def _play_with_pyaudio(seg):
    import pyaudio

    p = pyaudio.PyAudio()
    stream = p.open(format=p.get_format_from_width(seg.sample_width),
                    channels=seg.channels,
                    rate=seg.frame_rate,
                    output_device_index=1,
                    output=True)

    # Just in case there were any exceptions/interrupts, we release the resource
    # So as not to raise OSError: Device Unavailable should play() be used again
    try:
        # break audio into half-second chunks (to allows keyboard interrupts)
        for chunk in make_chunks(seg, 500):
            stream.write(chunk._data)
    finally:
        stream.stop_stream()
        stream.close()

        p.terminate()
```
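Editing files under site-packages is lost whenever pydub is reinstalled or upgraded. As an alternative, the same playback loop can live in the project's own code and take the device index as a parameter. The sketch below is not the author's method: `find_output_device`, `list_devices` and `play_on_device` are hypothetical names, and the device name fragment is an assumption that has to be adapted to the actual hardware. Note also that the index PyAudio assigns to a device may not match the card number printed by aplay -l, so looking the device up by name can be more robust.

```python
def find_output_device(devices, name_fragment):
    """Return the index of the first output-capable device whose name
    contains name_fragment (case-insensitive), or None if there is none."""
    for info in devices:
        has_output = info.get("maxOutputChannels", 0) > 0
        if has_output and name_fragment.lower() in info["name"].lower():
            return info["index"]
    return None


def list_devices():
    """Collect the device info dictionaries reported by PyAudio."""
    import pyaudio  # imported lazily; only needed on the target machine

    p = pyaudio.PyAudio()
    try:
        return [p.get_device_info_by_index(i) for i in range(p.get_device_count())]
    finally:
        p.terminate()


def play_on_device(seg, device_index, chunk_ms=500):
    """Play a pydub AudioSegment on the given PyAudio output device."""
    import pyaudio

    p = pyaudio.PyAudio()
    stream = p.open(format=p.get_format_from_width(seg.sample_width),
                    channels=seg.channels,
                    rate=seg.frame_rate,
                    output=True,
                    output_device_index=device_index)
    try:
        # Write in short chunks so a keyboard interrupt is handled promptly.
        for start in range(0, len(seg), chunk_ms):
            stream.write(seg[start:start + chunk_ms].raw_data)
    finally:
        stream.stop_stream()
        stream.close()
        p.terminate()
```

In the test code, the call to `pydub.playback._play_with_pyaudio(wav)` could then be replaced with something like `play_on_device(wav, find_output_device(list_devices(), "i2s"))`, where `"i2s"` stands for whatever substring identifies the HAT in the device list.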

genvex · Published on 2023-12-21 20:53
