Two Supported Forms of Qwen3-TTS
Qwen-TTS is a speech synthesis technology that converts text into realistic, natural-sounding human speech. Its key highlight is the ability to automatically adjust the rhythm and emotion of the speech based on the text content. The software supports it in two forms:
- The Aliyun Bailian online API.
- The local Qwen3-TTS built into the software.
One: Qwen3-TTS Aliyun Bailian API (Online Version)
The qwen3-tts model supports 10 languages and multiple Chinese dialects. Model name: qwen3-tts-flash
qwen3-tts voices and supported languages:
{
"芊悦(Cherry)": "Cherry",
"苏瑶(Serena)": "Serena",
"晨煦(Ethan)": "Ethan",
"千雪(Chelsie)": "Chelsie",
"茉兔(Momo)": "Momo",
"十三(Vivian)": "Vivian",
"月白(Moon)": "Moon",
"四月(Maia)": "Maia",
"凯(Kai)": "Kai",
"不吃鱼(Nofish)": "Nofish",
"萌宝(Bella)": "Bella",
"詹妮弗(Jennifer)": "Jennifer",
"甜茶(Ryan)": "Ryan",
"卡捷琳娜(Katerina)": "Katerina",
"艾登(Aiden)": "Aiden",
"沧明子(Eldric Sage)": "Eldric Sage",
"乖小妹(Mia)": "Mia",
"沙小弥(Mochi)": "Mochi",
"燕铮莺(Bellona)": "Bellona",
"田叔(Vincent)": "Vincent",
"萌小姬(Bunny)": "Bunny",
"阿闻(Neil)": "Neil",
"墨讲师(Elias)": "Elias",
"徐大爷(Arthur)": "Arthur",
"邻家妹妹(Nini)": "Nini",
"诡婆婆(Ebona)": "Ebona",
"小婉(Seren)": "Seren",
"顽屁小孩(Pip)": "Pip",
"少女阿月(Stella)": "Stella",
"博德加(Bodega)": "Bodega",
"索尼莎(Sonrisa)": "Sonrisa",
"阿列克(Alek)": "Alek",
"多尔切(Dolce)": "Dolce",
"素熙(Sohee)": "Sohee",
"小野杏(Ono Anna)": "Ono Anna",
"莱恩(Lenn)": "Lenn",
"埃米尔安(Emilien)": "Emilien",
"安德雷(Andre)": "Andre",
"拉迪奥·戈尔(Radio Gol)": "Radio Gol",
"上海-阿珍(Jada)": "Jada",
"北京-晓东(Dylan)": "Dylan",
"南京-老李(Li)": "Li",
"陕西-秦川(Marcus)": "Marcus",
"闽南-阿杰(Roy)": "Roy",
"天津-李彼得(Peter)": "Peter",
"四川-晴儿(Sunny)": "Sunny",
"四川-程川(Eric)": "Eric",
"粤语-阿强(Rocky)": "Rocky",
"粤语-阿清(Kiki)": "Kiki"
}

Step 1: Get and Configure Your API KEY
- Visit the Aliyun Bailian platform: https://bailian.console.aliyun.com/?tab=model#/api-key
- Log in to your Aliyun account (if you don't have one, register by following the prompts).
- On the API-KEY management page, click "Create API-KEY" (创建API-KEY). The system automatically generates a string starting with "sk-". This is your API KEY; copy it.
- Return to the pyVideoTrans software. Open TTS Settings in the top menu bar and select Qwen TTS from the dropdown menu.
- In the Qwen3 TTS configuration window that pops up, paste the API KEY into the "API KEY" input box. Click the "Test" button to hear a sample; if you hear sound, the configuration works. Finally, click Save.
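Before pasting the key, a quick format check can catch copy errors. The sketch below only tests the "sk-" prefix mentioned above; it does not contact Aliyun, so it cannot tell whether a key is actually valid. The function name and the QWEN_TTS_API_KEY environment variable are hypothetical and are not part of pyVideoTrans.

```python
import os


def looks_like_bailian_key(key: str) -> bool:
    """Return True if the string has the 'sk-' shape described above."""
    key = key.strip()
    # The guide only states that keys start with "sk-". Requiring at least
    # one character after the prefix is an extra assumption of this sketch.
    return key.startswith("sk-") and len(key) > len("sk-")


# QWEN_TTS_API_KEY is a hypothetical variable name used for illustration only.
candidate = os.environ.get("QWEN_TTS_API_KEY", "")
if looks_like_bailian_key(candidate):
    print("Key format looks plausible.")
else:
    print("Key is missing or does not start with 'sk-'.")
```

A key that passes this check can still be rejected by the service, so the "Test" button in the configuration window remains the real verification.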

Step 2: Use Qwen3-TTS in Video Translation
Once configured, you can enable Qwen3-TTS when processing individual videos.
- On the main interface of pyVideoTrans, find the "Voiceover Channel" (配音渠道) dropdown menu, click it, and select "Qwen3 TTS".
- In the adjacent "Voice Role" (配音角色) menu, you can select your preferred voice. For example, choose "Cherry" to experience a standard female voice, or "Sunny" for an interesting Sichuan dialect voiceover.
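For readers who want to see how the voice table relates display labels to voice names, here is a minimal sketch. It uses a small subset of the table above and assumes, based only on how the labels are written, that dialect voices are the ones whose label contains a "-" after the region name. The helper names are hypothetical and do not reflect pyVideoTrans internals.

```python
# A subset of the voice table: display label -> voice name.
VOICES = {
    "芊悦(Cherry)": "Cherry",
    "晨煦(Ethan)": "Ethan",
    "四川-晴儿(Sunny)": "Sunny",
    "四川-程川(Eric)": "Eric",
    "粤语-阿强(Rocky)": "Rocky",
}


def voice_name(label: str) -> str:
    """Look up the voice name for a display label."""
    return VOICES[label]


def dialect_voices(voices: dict) -> dict:
    """Group voice names by the region prefix that precedes '-' in the label."""
    groups = {}
    for label, name in voices.items():
        if "-" in label:  # assumption: only dialect labels contain '-'
            region = label.split("-", 1)[0]
            groups.setdefault(region, []).append(name)
    return groups


print(voice_name("芊悦(Cherry)"))
print(dialect_voices(VOICES))
```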

Step 3: Use in Batch Dubbing and Multi-Role Dubbing
Qwen-TTS can also be used in batch processing tasks, which saves time when you have many files to dub.
- Batch Dubbing for Subtitles: If you have multiple SRT subtitle files that need dubbing, you can switch to the "Batch Dubbing for Subtitles" interface. Similarly, select "Qwen TTS" in the "Voiceover Channel" at the bottom and choose your desired role.
- Subtitle Multi-Role Dubbing: This feature is equally applicable when handling dialogues with multiple characters. You can assign different Qwen-TTS voices to different characters in the "Subtitle Multi-Role Dubbing" area.

Two: Qwen3-TTS Local Built-in (Offline Version)
Make sure you have upgraded to version 3.97 or later. The built-in version works out of the box and always uses the 1.7B model.
The model is downloaded automatically on first use. You can also download it manually, as follows:
Open the models folder in the software directory and create 2 new folders: models--Qwen--Qwen3-TTS-12Hz-1.7B-Base and models--Qwen--Qwen3-TTS-12Hz-1.7B-CustomVoice.
First, open this URL: https://huggingface.co/Qwen/Qwen3-TTS-12Hz-1.7B-Base/tree/main. Download all files and folders, then place them inside the models/models--Qwen--Qwen3-TTS-12Hz-1.7B-Base folder.
Then, open this URL: https://huggingface.co/Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice/tree/main. Similarly, download all files and folders into the models/models--Qwen--Qwen3-TTS-12Hz-1.7B-CustomVoice folder.
As shown in the images below:
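The two folders for the manual download can also be created with a short script. This is a minimal sketch using only the Python standard library. It creates the folders named above and reports whether each one already contains anything, but it does not download the model files. The function name is hypothetical, and you must supply the path to the software directory yourself.

```python
from pathlib import Path

# Folder names taken from the manual download instructions above.
MODEL_FOLDERS = (
    "models--Qwen--Qwen3-TTS-12Hz-1.7B-Base",
    "models--Qwen--Qwen3-TTS-12Hz-1.7B-CustomVoice",
)


def prepare_model_folders(software_dir) -> dict:
    """Create both model folders under <software_dir>/models.

    Returns a dict mapping each folder name to True if the folder
    already contains at least one file or subfolder, else False.
    """
    status = {}
    for name in MODEL_FOLDERS:
        folder = Path(software_dir) / "models" / name
        folder.mkdir(parents=True, exist_ok=True)
        status[name] = any(folder.iterdir())
    return status


# Example (replace the path with your own software directory):
# print(prepare_model_folders("/path/to/pyvideotrans"))
```

A folder reported as False is still empty, which means the files from the corresponding Hugging Face page have not been placed there yet.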
Reference Audio:
Suitable for cloning a voice based on a 3-10 second segment of reference audio.
Go to Menu -- Tools -- TTS Settings -- Qwen-tts(Local). Enter the reference audio filename and the text spoken in that audio, one filename/text pair per line. You can then select this reference audio for cloning in the "Voice Role" selection.
Example:
n10.wav#你说四大皆空,却为何紧闭双眼,你若挣开眼睛看看我,我不相信,你两眼空空
Place the n10.wav audio file in the f5-tts folder within the software directory, then fill in the text corresponding to what is spoken in the audio after the # symbol.
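To make the filename#text format concrete, the sketch below parses such entries, one per line. It illustrates the format only and is not the parser pyVideoTrans uses. The names ReferenceAudio and parse_reference_lines are hypothetical, and splitting at the first # is an assumption.

```python
from typing import List, NamedTuple


class ReferenceAudio(NamedTuple):
    filename: str  # audio file name, e.g. "n10.wav"
    text: str      # what is spoken in that audio


def parse_reference_lines(raw: str) -> List[ReferenceAudio]:
    """Parse 'filename#text' entries, one per line, skipping blank lines."""
    entries = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at the first '#': the left side is the file, the right the text.
        filename, sep, text = line.partition("#")
        if not sep or not filename.strip() or not text.strip():
            raise ValueError(f"expected 'filename#text', got: {line!r}")
        entries.append(ReferenceAudio(filename.strip(), text.strip()))
    return entries


sample = "n10.wav#你说四大皆空,却为何紧闭双眼,你若挣开眼睛看看我,我不相信,你两眼空空"
for entry in parse_reference_lines(sample):
    print(entry.filename, "->", entry.text)
```

Raising an error on a malformed line, rather than skipping it silently, makes a missing # or empty transcript easy to spot.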


Voice Style Prompt
When using the preset voices Vivian, Uncle_fu, or Sohee built into the Qwen-TTS model, you can provide a guidance prompt.
Open Menu -- Tools -- TTS Settings -- Qwen-tts(Local). In the prompt text box, enter a short prompt, such as "use an angry and crazy tone". This prompt is applied automatically whenever one of these built-in voices is used.



