How to achieve better sentence segmentation results
Select the optimal model
- For videos with the spoken language being Chinese, first choose "ByteDance Speech Large Model Speedy", "Qwen-ASR (Local)", or "Alibaba FunASR (Local)" + paraformer-zh
- For videos with other spoken languages, first choose "openai-whisper (Local)" + large-v3 model, "faster-whisper (Local)" + large-v3 model, or "OpenAI Online Recognition API"
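The recommendation above amounts to a simple lookup by spoken language. The sketch below is only an illustration: the function name `recommended_models` and the accepted language labels are hypothetical and are not part of the application.

```python
from typing import List

# First-choice recognition channels, copied from the list above.
CHINESE_FIRST_CHOICES: List[str] = [
    "ByteDance Speech Large Model Speedy",
    "Qwen-ASR (Local)",
    "Alibaba FunASR (Local) + paraformer-zh",
]
OTHER_FIRST_CHOICES: List[str] = [
    "openai-whisper (Local) + large-v3 model",
    "faster-whisper (Local) + large-v3 model",
    "OpenAI Online Recognition API",
]

# Hypothetical labels treated as "Chinese" in this sketch.
_CHINESE_LABELS = {"zh", "zh-cn", "zh-tw", "chinese"}


def recommended_models(spoken_language: str) -> List[str]:
    """Return the first-choice channels for the given spoken language."""
    if spoken_language.strip().lower() in _CHINESE_LABELS:
        return CHINESE_FIRST_CHOICES
    return OTHER_FIRST_CHOICES


if __name__ == "__main__":
    print(recommended_models("Chinese"))
    print(recommended_models("en"))
```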
Set appropriate segmentation parameters: in the "Menu--Tools--Advanced Options--Speech Recognition Parameters" area
- Set "Minimum Duration / Millisecond" to 1000 (the minimum subtitle duration, in milliseconds)
- Set "Maximum Voice Duration Seconds" to a value between 3 and 5 (the maximum subtitle duration, in seconds)
- Set "Silence Segmentation Duration Milliseconds" to a value between 140 and 600 (smaller values lead to finer segmentation, larger values result in longer sentences)
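How these three parameters interact can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the application's actual algorithm: it assumes a VAD has already produced voiced regions as (start_ms, end_ms) pairs, and the function `segment` and its argument names are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start_ms, end_ms) of one voiced region


def segment(voiced: List[Span], min_ms: int = 1000,
            max_s: float = 4.0, silence_ms: int = 300) -> List[Span]:
    """Group voiced regions into subtitle-sized segments (illustrative only)."""
    max_ms = int(max_s * 1000)

    # Step 1: hard-split any voiced region that alone exceeds the maximum.
    pieces: List[Span] = []
    for start, end in voiced:
        while end - start > max_ms:
            pieces.append((start, start + max_ms))
            start += max_ms
        pieces.append((start, end))

    # Step 2: greedily merge a piece into the previous segment when the
    # silence between them is shorter than silence_ms, or when the previous
    # segment is still below min_ms, as long as the result fits in max_ms.
    segments: List[Span] = []
    for start, end in pieces:
        if segments:
            prev_start, prev_end = segments[-1]
            gap = start - prev_end
            too_short = prev_end - prev_start < min_ms
            fits = end - prev_start <= max_ms
            if fits and (gap < silence_ms or too_short):
                segments[-1] = (prev_start, end)
                continue
        segments.append((start, end))
    return segments


if __name__ == "__main__":
    regions = [(0, 1200), (1500, 2600)]  # two phrases, 300 ms pause between
    print(segment(regions, silence_ms=140))  # [(0, 1200), (1500, 2600)]
    print(segment(regions, silence_ms=600))  # [(0, 2600)]
```

With a 140 ms threshold, the 300 ms pause counts as a break and yields two subtitles; with 600 ms, the same pause is ignored and the phrases are joined. In this sketch the minimum duration is best-effort: a short segment is left as-is if merging it would exceed the maximum.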
If the dubbing role on the main interface is not set to "clone" (voice-cloning dubbing), you can uncheck "Merge Short Subtitles to Adjacent" in the "Speech Recognition Parameters" area.

In "Select VAD", the default sentence segmentation model is "ten-vad"; you can try switching to the "silero" model and adjusting its settings in the "Speech Recognition Parameters" area.

Second recognition: if dubbing is selected, you can check "Second Recognition" in the top right corner of the main interface. This transcribes the dubbed audio again and generates shorter subtitles; for this pass, the duration limits are automatically set to half of the values configured for "Minimum Duration / Millisecond" and "Maximum Voice Duration Seconds".

Select "Noise Reduction" or "Separate Voice and Background Sound": if the audio background is not clean, you can check "Noise Reduction" (very slow) in the top right corner of the main interface, or "Separate Voice and Background" under "Set More Parameters". If both are selected, only "Separate Voice and Background" is executed.

If using "faster-whisper (Local)", you can also try unchecking "Pre-segment audio with Whisper?" in the "Menu--Tools--Advanced Options--Speech Recognition Parameters" area. This may improve sentence segmentation but could also generate longer subtitles.

Translate only one video at a time; after speech recognition is complete, an editing box pops up so you can adjust the recognized subtitles.
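Two of the rules described earlier are easy to misread: the halved duration limits used by the second recognition pass, and the precedence between noise reduction and voice separation. The sketch below restates them; the `Options` class, its field names, and the returned labels are hypothetical and do not reflect the application's internal code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Options:
    # Hypothetical mirror of the settings discussed in this section.
    min_duration_ms: float = 1000
    max_duration_s: float = 4
    noise_reduction: bool = False
    separate_voice: bool = False


def second_pass_limits(opts: Options) -> Tuple[float, float]:
    """Second recognition uses half of the configured min and max durations."""
    return opts.min_duration_ms / 2, opts.max_duration_s / 2


def preprocessing_step(opts: Options) -> Optional[str]:
    """If both options are checked, only voice/background separation runs."""
    if opts.separate_voice:
        return "separate"
    if opts.noise_reduction:
        return "denoise"
    return None


if __name__ == "__main__":
    opts = Options(noise_reduction=True, separate_voice=True)
    print(second_pass_limits(opts))   # (500.0, 2.0)
    print(preprocessing_step(opts))   # separate
```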
