How to Use Original Voice Dubbing
In dubbing, we usually pick one fixed voice, such as "yunxi", "xiaoyi", or "解说小帅", and use it for the whole video. For scenes with multiple speakers, however, a single voice is often not ideal. A better result is for each speaker to have a distinct voice, preferably one that matches that speaker's voice in the original video. For example, if Zhu Bajie (Pigsy) speaks in the original video and the translated English version should still sound like Zhu Bajie, you need the original voice cloning function.
Currently, the software supports three dubbing channels to achieve original voice cloning: clone-voice, CosyVoice, and F5-TTS.
Principle: When a segment is dubbed (e.g., 00:00:03 --> 00:00:08), the original audio for that segment is clipped first, and the original text and the translated target text for that audio are collected. This data is then sent to the dubbing channel, which generates speech for the target text using the timbre of the original audio as a reference.
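The clipping step described above can be sketched in Python. This is only an illustration under stated assumptions, not the software's actual implementation: the function names, the file names in the usage note, and the choice of ffmpeg as the clipping tool are all assumptions made for the example. The sketch parses a subtitle time range and builds (but does not run) an ffmpeg command that would cut the matching audio.

```python
import re

# Matches HH:MM:SS with an optional ,mmm or .mmm fraction (SRT-style timestamps).
_TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})(?:[,.](\d{1,3}))?")


def to_seconds(stamp: str) -> float:
    """Convert a timestamp such as '00:00:03' or '00:01:02,500' to seconds."""
    match = _TIMESTAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not a valid timestamp: {stamp!r}")
    hours, minutes, seconds, millis = match.groups()
    # Pad the fraction to three digits so '5' means 500 ms, not 5 ms.
    fraction = int(millis.ljust(3, "0")) / 1000 if millis else 0.0
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + fraction


def parse_range(line: str) -> tuple[float, float]:
    """Split a line such as '00:00:03 --> 00:00:08' into (start, end) seconds."""
    start, end = line.split("-->")
    return to_seconds(start), to_seconds(end)


def build_clip_command(src: str, dst: str, start: float, end: float) -> list[str]:
    """Build an ffmpeg argument list that extracts audio between start and end.

    Assumption: ffmpeg is used for clipping. The command is only built here,
    not executed.
    """
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-ss", f"{start:.3f}",
        "-to", f"{end:.3f}",
        "-vn",  # drop the video stream, keep audio only
        dst,
    ]
```

For the example segment, parse_range("00:00:03 --> 00:00:08") gives the start and end in seconds, and build_clip_command("video.mp4", "segment.wav", 3.0, 8.0) produces a command that could be passed to subprocess.run. The resulting clip, together with the original and translated text, is what a dubbing channel would receive as its reference.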
Using the clone-voice Dubbing Channel
You need to install the clone-voice project from https://github.com/jianchang512/clone-voice. Open the project homepage and read the instructions carefully. You can deploy clone-voice from source. On Windows, you can instead download the integrated package from Releases (https://github.com/jianchang512/clone-voice/releases), linked on the right side of the project page, decompress it, and double-click app.exe to start it.
Once the service reports a successful start, enter the default API address http://127.0.0.1:9988 in the HTTP address field under Menu--TTS Settings--Original Voice Cloning clone-voice in the video translation software. After the test passes, you can start using it.
Using the CosyVoice Dubbing Channel
You also need to install the CosyVoice project. Installation instructions can be found at https://pyvideotrans.com/cosyvoice.html
Third-party integrated packages also work, but they do not support voice cloning and can only use fixed, preset reference audio.
After installing according to the instructions, download the api.py file from https://github.com/jianchang512/cosyvoice-api/blob/main/api.py and place it in the CosyVoice project, in the same directory as the webui.py file.
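A quick way to confirm the placement is to check that both files sit in the same folder. The sketch below is a hypothetical helper, not part of CosyVoice or the video translation software; the function name and the example path in the usage note are assumptions.

```python
from pathlib import Path


def api_is_next_to_webui(project_dir: str) -> bool:
    """Return True if api.py and webui.py both exist directly in project_dir."""
    root = Path(project_dir)
    return (root / "api.py").is_file() and (root / "webui.py").is_file()
```

For example, api_is_next_to_webui("D:/CosyVoice") should return True once api.py has been copied into a CosyVoice checkout at that (assumed) location.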
Then start api.py and enter its address in the API address field under Menu--TTS Settings--CosyVoice in the video translation software. The default address is http://127.0.0.1:9233
Using the F5-TTS Dubbing Channel
You need to install the F5-TTS project. Detailed installation instructions can be found at https://pyvideotrans.com/f5tts.html
You can install from source or use the integrated package for Windows. After installation, double-click run-api.bat to start the API service, then enter the default address http://127.0.0.1:5010 in the API address field under Menu--TTS Settings--F5-TTS in the video translation software.
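Before entering an address in the settings, it can help to confirm that the service is actually listening. The following is a minimal sketch that only tests whether the TCP port accepts connections. It makes no assumption about any channel's HTTP endpoints, and the helper names are hypothetical. The three addresses are the defaults quoted in this guide.

```python
import socket
from urllib.parse import urlsplit

# Default addresses as quoted in this guide; change them if you start a
# service on a different host or port.
DEFAULT_ADDRESSES = {
    "clone-voice": "http://127.0.0.1:9988",
    "CosyVoice": "http://127.0.0.1:9233",
    "F5-TTS": "http://127.0.0.1:5010",
}


def host_and_port(address: str) -> tuple[str, int]:
    """Extract (host, port) from an address such as http://127.0.0.1:9988."""
    parts = urlsplit(address)
    if parts.hostname is None:
        raise ValueError(f"no host in address: {address!r}")
    # Fall back to the scheme's standard port when none is given.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def is_listening(address: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the address succeeds."""
    host, port = host_and_port(address)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, address in DEFAULT_ADDRESSES.items():
        state = "reachable" if is_listening(address) else "not reachable"
        print(f"{name}: {address} is {state}")
```

A successful connection only shows that something is listening on the port. The test in the software's TTS settings remains the real check that the channel works.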
In the main interface, select "clone" in the character selection to dub with the cloned voice.
Note that clone-voice supports more than ten languages, whereas F5-TTS and CosyVoice support voice cloning only for Chinese and English.