Advanced Settings Options Explained

In the top menu, open Tools/Options > Advanced Options to customize parameters for finer control, as shown below.

[Screenshot: Advanced Options dialog]

Click the text title on the left to expand the detailed description.

Interface Language: Sets the language of the software interface. A restart is required after changing it. By default it follows the operating system language. zh = Chinese, en = English.

Pause Countdown: When translating a single video, the software pauses after subtitle recognition and again after subtitle translation. Set the pause duration in seconds here.

Background Volume Multiplier: Scales the background audio volume relative to its original volume. For example, entering 0.8 reduces the volume to 80% of the original.

Loop Background Audio: Whether to loop the background audio if it is shorter than the video. true=loop, false=don't loop.

302.ai Translation Model List: Names of the 302.ai models used for translation, separated by English (half-width) commas.

302.ai TTS Model List: Names of the 302.ai models used for voice synthesis, separated by English (half-width) commas.

ChatGPT Model List: Selectable ChatGPT models, separated by English (half-width) commas.

Gemini Model List: Selectable Gemini models, separated by English (half-width) commas.

Azure Model List: Selectable Azure models, separated by English (half-width) commas.

Local LLM Model List: Selectable local LLM models, separated by English (half-width) commas.
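
To illustrate the comma-separated format used by the model lists above, here is a minimal sketch of how such a value could be split into individual model names. The function name parse_model_list and the example model names are hypothetical, and accepting full-width commas is a convenience of this sketch only; nothing here describes how the software itself parses the field.

```python
def parse_model_list(raw: str) -> list[str]:
    """Split a comma-separated model list into clean model names."""
    # Assumption of this sketch: also tolerate the full-width comma,
    # which is easy to type by accident with a Chinese input method.
    normalized = raw.replace("，", ",")
    # Drop surrounding whitespace and ignore empty entries.
    return [name.strip() for name in normalized.split(",") if name.strip()]


if __name__ == "__main__":
    print(parse_model_list("model-a, model-b,model-c"))
```

In the settings field itself, stick to half-width commas as described above.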

ByteDance Volcano Inference Endpoint: Enter the name of the inference endpoint created in ByteDance Volcano Ark. See https://pyvideotrans.com/zijiehuoshan for creation instructions.

Video Transcoding Loss Control: Controls quality loss during video transcoding. 0 = minimum loss, 51 = maximum loss; the default is 13.

NVIDIA Use qp instead of crf: Whether to use qp instead of crf to control video quality loss on NVIDIA graphics cards. true=yes, false=no.

Output Video Quality Control: Controls the output video quality and file size. Faster settings produce lower quality.

Custom ffmpeg Command Parameters: Extra parameters for the ffmpeg command. They are inserted in the second-to-last position of the command, for example -bf 7 -b_ref_mode middle

264 or 265 Video Encoding: Enter 264 to use libx264 encoding, or 265 to use libx265 encoding. 264 has better compatibility; 265 has a higher compression ratio and higher clarity.
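
The encoding options above can be pictured as pieces of one ffmpeg command line. The following is a rough sketch under several assumptions: the function build_video_args, its parameter names, and the file names are hypothetical; the loss value is passed as -crf (or -qp when that option is enabled); the quality control is treated as an ffmpeg -preset value; and only the software encoders libx264 and libx265 are considered. The command the software actually builds may differ.

```python
import shlex


def build_video_args(codec="264", loss=13, use_qp=False, preset="fast",
                     extra="", source="input.mp4", output="out.mp4"):
    """Assemble an illustrative ffmpeg argument list from the settings."""
    encoder = {"264": "libx264", "265": "libx265"}[str(codec)]
    # Assumption: the loss value goes to -crf, or to -qp when requested.
    quality_flag = "-qp" if use_qp else "-crf"
    args = ["ffmpeg", "-y", "-i", source,
            "-c:v", encoder, quality_flag, str(loss),
            "-preset", preset, output]
    # Custom parameters go in the second-to-last position,
    # i.e. immediately before the final argument (the output file here).
    args[-1:-1] = shlex.split(extra)
    return args


if __name__ == "__main__":
    print(" ".join(build_video_args(extra="-bf 7 -b_ref_mode middle")))
```

Keeping the custom parameters just before the final argument means they act as output options in this sketch, which is where encoder flags such as -bf belong.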

Maximum Audio Speed Multiplier: The maximum factor by which dubbed audio may be sped up. Enter a number from 1 to 100; the default is 3, meaning the audio is accelerated to at most 3 times its original speed. Used to align the length of the dubbed audio with the original length.

Video Slow-Motion Multiplier: A number greater than 1 sets the maximum allowable slow-motion factor for the video. 0 or 1 means the video is not slowed down. Used to extend the video so that it aligns with the dubbing and subtitles.
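
As a worked illustration of how the two multipliers above could interact, the sketch below first speeds up the dubbing, capped at the audio limit, and then slows the video to cover any remainder, capped at the video limit. This ordering is an assumption made for the example, and plan_alignment is a hypothetical name; the software's real alignment strategy is not described here and may distribute the adjustment differently.

```python
def plan_alignment(dub_ms, slot_ms, max_audio_speed=3.0, max_video_slow=1.0):
    """Return (audio_speed, video_slowdown) factors for one subtitle."""
    if dub_ms <= slot_ms:
        # The dubbing already fits into the original time slot.
        return 1.0, 1.0
    needed = dub_ms / slot_ms                    # total stretch to absorb
    audio_speed = min(needed, max_audio_speed)   # cap the audio speed-up
    remaining = needed / audio_speed             # > 1 only if the cap was hit
    video_slow = 1.0
    if max_video_slow > 1 and remaining > 1:
        video_slow = min(remaining, max_video_slow)  # cap the slow-motion
    return audio_speed, video_slow


if __name__ == "__main__":
    # 6 s of dubbing for a 4 s slot: a 1.5x audio speed-up is enough.
    print(plan_alignment(6000, 4000))
    # 16 s of dubbing for a 4 s slot: audio is capped at 3x, video covers the rest.
    print(plan_alignment(16000, 4000, max_audio_speed=3.0, max_video_slow=2.0))
```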

Remove Silence at the End of Dubbing: Whether to remove trailing silence at the end of the dubbing. true=remove, false=don't remove.

Remove Subtitles Longer Than Dubbing: Whether to remove silence where the original subtitle duration is longer than the dubbing duration. For example, if the original duration is 5s and the dubbing is 3s, whether to remove the 2s of silence. true=remove, false=don't remove.

Remove Silence Between Two Subtitles: The amount of silence, in ms, to remove between two subtitles. For example, with a value of 100, if the interval between two subtitles is greater than 100ms, 100ms of it is removed. -1 = remove the silence completely.
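
The rule above can be written as a small function. This is only a sketch: trimmed_gap is a hypothetical name, and leaving gaps that are not longer than the configured value unchanged is an assumption, since the description only covers the case where the gap is greater than the value.

```python
def trimmed_gap(gap_ms: int, remove_ms: int) -> int:
    """Return the silence left between two subtitles after trimming."""
    if remove_ms == -1:
        # -1 removes the silence completely.
        return 0
    if gap_ms > remove_ms:
        # The gap is longer than the setting: cut remove_ms out of it.
        return gap_ms - remove_ms
    # Assumption of this sketch: shorter gaps are left untouched.
    return gap_ms


if __name__ == "__main__":
    print(trimmed_gap(250, 100))
    print(trimmed_gap(250, -1))
```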

Force Subtitle Timeline Modification: true=force subtitle timeline modification to match the audio, false=don't modify, keep the original subtitle timeline. Not modifying may lead to mismatched subtitles and audio.

Enable VAD: Enable VAD when using the faster-whisper subtitle overall recognition mode. true=enable, false=disable. Enabled by default.

Minimum Silence Segment: Minimum length of a silence segment, in ms; the default is 250ms.

Maximum Sentence Duration: Maximum duration of a sentence, in seconds; the default is 6s.

VAD Threshold: VAD threshold.

VAD Pad Value: VAD pad value.
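
For orientation, the VAD settings above resemble the VAD options that the faster-whisper library accepts through the vad_filter and vad_parameters arguments of its transcribe method. The sketch below builds such arguments. The mapping from each setting to a faster-whisper option name is an assumption, build_vad_kwargs is a hypothetical helper, and the threshold and pad values in the usage example are purely illustrative, since no defaults are stated for them here.

```python
def build_vad_kwargs(enabled, min_silence_ms, max_sentence_s, threshold, pad_ms):
    """Build keyword arguments for faster-whisper's transcribe() call."""
    if not enabled:
        return {"vad_filter": False}
    return {
        "vad_filter": True,
        "vad_parameters": {
            # Assumed mapping of the settings to faster-whisper VAD options.
            "min_silence_duration_ms": min_silence_ms,  # Minimum Silence Segment
            "max_speech_duration_s": max_sentence_s,    # Maximum Sentence Duration
            "threshold": threshold,                     # VAD Threshold
            "speech_pad_ms": pad_ms,                    # VAD Pad Value
        },
    }


if __name__ == "__main__":
    kwargs = build_vad_kwargs(True, min_silence_ms=250, max_sentence_s=6,
                              threshold=0.5, pad_ms=200)
    print(kwargs)
    # With faster-whisper installed, these could be passed along as:
    #   segments, info = model.transcribe("audio.wav", **kwargs)
```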

Silence Segment in Even Splitting: Silence segment in even splitting mode, default 10s.

Segment Duration in Even Splitting: Segment duration in seconds in even splitting mode.

Faster and OpenAI Model List: Model names available in faster mode and openai mode, separated by English (half-width) commas.

CUDA Data Type: CUDA data type used in faster mode. int8 = low resource consumption, fast, lower accuracy; float32 = high resource consumption, slow, higher accuracy; int8_float16 = device auto-select.

Whisper Model Prompt: Prompt sent to the whisper model.

Faster-Whisper CPU Processes: Number of CPU processes for subtitle recognition in faster mode.

Faster-Whisper Worker Processes: Number of worker processes running concurrently for subtitle recognition in faster mode.

Subtitle Recognition Accuracy Control 1: Subtitle recognition accuracy adjustment, 1-5, 1=lowest VRAM consumption, 5=highest VRAM consumption.

Subtitle Recognition Accuracy Control 2: Subtitle recognition accuracy adjustment, 1-5, 1=lowest VRAM consumption, 5=highest VRAM consumption.

Faster-Whisper Temperature Control: 0=less GPU resource usage but slightly worse results, 1=more GPU resource usage and better results.

Context Awareness: true=more GPU usage, better results, false=less GPU usage, slightly worse results.
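
Several of the recognition settings above look like options of the faster-whisper library. The sketch below shows one plausible way they could be grouped into arguments for WhisperModel(...) and for model.transcribe(...). Every mapping is an assumption, in particular reading the two accuracy controls as beam_size and best_of and Context Awareness as condition_on_previous_text. The settings keys and the function name are hypothetical, and the values in the usage example are illustrative only.

```python
def build_whisper_options(settings: dict) -> tuple[dict, dict]:
    """Split hypothetical settings into model and transcribe arguments."""
    model_kwargs = {
        "compute_type": settings["cuda_type"],       # CUDA Data Type
        "cpu_threads": settings["cpu_processes"],    # CPU Processes (assumed)
        "num_workers": settings["worker_processes"], # Worker Processes (assumed)
    }
    transcribe_kwargs = {
        "beam_size": settings["accuracy_1"],         # Accuracy Control 1 (assumed)
        "best_of": settings["accuracy_2"],           # Accuracy Control 2 (assumed)
        "temperature": settings["temperature"],      # Temperature Control
        "condition_on_previous_text": settings["context_awareness"],
        "initial_prompt": settings["prompt"] or None,  # Whisper Model Prompt
    }
    return model_kwargs, transcribe_kwargs


if __name__ == "__main__":
    example = {
        "cuda_type": "float32", "cpu_processes": 4, "worker_processes": 1,
        "accuracy_1": 5, "accuracy_2": 5, "temperature": 0,
        "context_awareness": False, "prompt": "",
    }
    model_kwargs, transcribe_kwargs = build_whisper_options(example)
    print(model_kwargs)
    print(transcribe_kwargs)
    # With faster-whisper installed, usage would look roughly like:
    #   model = WhisperModel("large-v2", device="cuda", **model_kwargs)
    #   segments, info = model.transcribe("audio.wav", **transcribe_kwargs)
```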

Hard Subtitle Font Pixel: Font size of hard subtitles, in pixels.

Hard Subtitle Font Name: Hard subtitle font name.

Hard Subtitle Text Color: Sets the font color. Note that the 6 characters after &H form a BGR color: each pair of hexadecimal digits is one channel, in the order blue, green, red, which is the reverse of the common RGB order.

Hard Subtitle Text Border Color: Sets the font border color. Note that the 6 characters after &H form a BGR color: each pair of hexadecimal digits is one channel, in the order blue, green, red, which is the reverse of the common RGB order.
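
Because the channel order is reversed, it is easy to get a color wrong when copying an RGB value from a color picker. Here is a small sketch that converts a common RGB hex color into the &H form described above; the helper name rgb_to_subtitle_color is hypothetical.

```python
def rgb_to_subtitle_color(rgb_hex: str) -> str:
    """Convert an RGB hex color like '#FF8800' to the '&HBBGGRR' form."""
    value = rgb_hex.lstrip("#")
    if len(value) != 6:
        raise ValueError("expected 6 hexadecimal digits, e.g. FF8800")
    int(value, 16)  # raises ValueError if the digits are not hexadecimal
    red, green, blue = value[0:2], value[2:4], value[4:6]
    # Reverse the channel order: blue first, then green, then red.
    return f"&H{blue}{green}{red}".upper()


if __name__ == "__main__":
    print(rgb_to_subtitle_color("#FF0000"))  # pure red
    print(rgb_to_subtitle_color("#336699"))
```

Pure red, FF0000 in RGB, therefore becomes &H0000FF, while white and black look the same in both orders.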

Hard Subtitle Vertical Shift: Subtitles are placed at the bottom of the video by default. A value greater than 0 moves the subtitles upward by that amount. The value must not exceed (video height - 20); at least 20 pixels of height must be reserved for displaying subtitles, otherwise they will be invisible.

Faster/OpenAI-Whisper Resentencing After Recognition: If selected, the recognized text is re-split into sentences using nltk after recognition.

Number of Characters Per Line for CJK Languages: Maximum number of characters per line in CJK hard subtitles. A line wraps when it exceeds this number. The default is 20 characters. This value is also used as a basis for resentencing.

Number of Characters Per Line for Other Languages: Maximum number of characters per line in hard subtitles for other languages. A line wraps when it exceeds this number. The default is 54 characters. This value is also used as a basis for resentencing.
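
To make the two wrapping limits concrete, the sketch below wraps CJK text by a fixed character count and other text at word boundaries using Python's textwrap module. It illustrates the idea only: wrap_subtitle is a hypothetical helper, and the software's own wrapping and resentencing logic may behave differently.

```python
import textwrap


def wrap_subtitle(text: str, is_cjk: bool, cjk_len: int = 20,
                  other_len: int = 54) -> str:
    """Wrap one subtitle so that no line exceeds the configured length."""
    if is_cjk:
        # CJK text has no spaces between words, so cut every cjk_len characters.
        pieces = [text[i:i + cjk_len] for i in range(0, len(text), cjk_len)]
    else:
        # Other languages: break at whitespace where possible.
        pieces = textwrap.wrap(text, width=other_len)
    return "\n".join(pieces)


if __name__ == "__main__":
    print(wrap_subtitle("字" * 45, is_cjk=True))
    print(wrap_subtitle("This is a fairly long English subtitle line that "
                        "needs to be wrapped for display.", is_cjk=False))
```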

Convert Traditional Chinese Subtitles to Simplified Chinese: Force the conversion of recognized traditional Chinese subtitles to simplified Chinese.

Number of Subtitles Translated Simultaneously: Number of subtitle lines translated at once; the default is 15.

Number of Retries for Translation Errors: Number of times to retry after a translation error; the default is 2.

Pause Time After Translation: Pause time in seconds after each translation, used to limit request frequency.
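
The three translation settings above work together as a batch size, a retry count, and a delay. The following sketch shows one way such a loop could be written. The function translate_in_batches and the translate callback are hypothetical stand-ins for a real translation request, and the exact behavior of the software (for example, whether it pauses after a failed attempt) is not specified here.

```python
import time


def translate_in_batches(lines, translate, pause_s, batch_size=15, retries=2):
    """Translate subtitle lines in batches, retrying failed batches."""
    results = []
    for start in range(0, len(lines), batch_size):
        batch = lines[start:start + batch_size]
        for attempt in range(retries + 1):   # first try plus the retries
            try:
                results.extend(translate(batch))
                break
            except Exception:
                if attempt == retries:
                    raise                    # retries exhausted: give up
        time.sleep(pause_s)                  # limit the request frequency
    return results


if __name__ == "__main__":
    def fake_translate(batch):
        # Stand-in for a real translation API call.
        return [text.upper() for text in batch]

    subtitles = [f"line {i}" for i in range(32)]
    print(len(translate_in_batches(subtitles, fake_translate, pause_s=0)))
```

With 32 lines and the default batch size of 15, this makes three requests of 15, 15, and 2 lines.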

Number of Subtitles Dubbed Simultaneously: Number of subtitle lines dubbed simultaneously.

AzureTTS Batch Line Count: Number of lines dubbed at once by azureTTS, default 150.

ChatTTS Voice Tone Value: chatTTS voice tone value.