Translating text to AI Voice Speech

Author: Axel Fernández Curros

This guide shows how to turn text into AI voice audio. Continuing the work on the Whisper audio-to-text and text-to-text translate.py script from the previous post, we create a script that converts any given text into an audio file spoken in our own voice, using a 10-second .wav recording as input. The script has been modified to accept a large number of phrases, so long texts can be used without affecting the audio quality.
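To illustrate the idea of feeding a long text to the script phrase by phrase, here is a minimal sketch of one way to split text into sentence-sized chunks. It is an assumption about the general approach, not the code used in voice_cloning.py; the function name split_into_phrases and the max_chars parameter are hypothetical.

```python
import re


def split_into_phrases(text, max_chars=200):
    """Split text into sentence-based chunks, each at most max_chars long where possible.

    Hypothetical helper: a single sentence longer than max_chars is kept whole.
    """
    # Break after ., ! or ? followed by whitespace, and drop empty pieces.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]

    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit, so start a new chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be synthesized separately and the resulting audio pieces joined, which keeps every individual input short.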

Record your own voice

The clip should be about 8-10 seconds long and of good quality. Audacity can be used to record and export the voice. Remember to export it in .wav format.
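Before running the script, the length of the exported clip can be checked with Python's standard wave module. This is a small sketch, assuming the file is a PCM-encoded .wav (the wave module cannot read 32-bit float exports); the function names and the 8-10 second defaults are illustrative choices taken from the recommendation above.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM .wav file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_suitable_reference(path, min_seconds=8.0, max_seconds=10.0):
    """Check that the clip length falls in the recommended range (hypothetical helper)."""
    return min_seconds <= wav_duration_seconds(path) <= max_seconds
```

For example, is_suitable_reference("audio.wav") returns True only when the recording is between 8 and 10 seconds long.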

Install Python dependencies

Linux

pip3 install gitpython
pip3 install gdown

Windows (chocolatey)

pip install gitpython
pip install gdown
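To confirm that both packages were installed, a quick check like the following can be run on either platform. It is a minimal sketch: the helper name missing_packages is hypothetical, and it relies on the fact that the gitpython package is imported as git while gdown is imported as gdown.

```python
import importlib.util

# Maps the pip package name to the module name it installs.
REQUIRED = {"gitpython": "git", "gdown": "gdown"}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies are installed.")
```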

Clone repository

git clone https://github.com/Axlfc/UE5-python

Navigate into the script folder from the repository directory

cd UE5-python/Content/Python

Run the Python script

Linux

python3 voice_cloning.py "Whatever I write here is going to be read with the same voice" audio.wav

Windows

python voice_cloning.py "Whatever I write here is going to be read with the same voice" audio.wav
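The same call can also be made from another Python program. The sketch below assumes only what the commands above show, namely that voice_cloning.py takes the text and then the reference .wav as positional arguments; build_command and run_voice_cloning are hypothetical helper names. Passing the arguments as a list avoids shell quoting problems when the text itself contains quotes.

```python
import subprocess
import sys


def build_command(text, reference_wav, script="voice_cloning.py"):
    """Build the argument list, using the interpreter that runs this program."""
    return [sys.executable, script, text, reference_wav]


def run_voice_cloning(text, reference_wav):
    """Run the script from the current directory and raise if it fails."""
    subprocess.run(build_command(text, reference_wav), check=True)
```

Because sys.executable is used instead of a hard-coded python or python3, the same code works on both Linux and Windows.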
