Open Source Video Translation

Subtitle generation + translation + dubbing = a video with subtitles and dubbing

Simple and Easy to Use

Our goal is an easy-to-use, ready-to-run video translation tool, especially suited to newcomers and users with limited technical knowledge. To this end, the software keeps its functionality as simple as possible. For example, although WhisperX is stronger at speech recognition and speaker differentiation, it is complex to install and error-prone, so we chose an easier-to-use solution that lets users get started quickly.

Versatile Functionality

In addition to translating videos between multiple languages, the software integrates speech transcription, text-to-speech, and subtitle translation. Users who only need transcription or dubbing can use these functions on their own, without running a complete video translation or downloading additional software.

Cross-Platform Support

The software runs on multiple platforms. Windows users can simply download and extract it, while macOS and Linux users can get started quickly with a one-click deployment from source.

Rich Third-Party Interface Support

Video translation is divided into three stages: speech recognition, subtitle translation, and text-to-speech. The software supports multiple third-party interfaces at each stage.

For example, in the speech recognition stage you can use faster-whisper or openai-whisper, an online API, or a self-hosted speech recognition service.

The subtitle translation stage supports Google Translate, ChatGPT, or local large models.

The dubbing stage is equally flexible: users can keep the default Edge-TTS dubbing or integrate other APIs such as OpenAI, ElevenLabs, or Azure.

If you have developed your own API service, each stage can use it as well.
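The three-stage design described above can be pictured as a pipeline of interchangeable callables. The following is only a minimal sketch under stated assumptions, not the software's actual code: `Subtitle`, `translate_video`, and the `fake_*` backends are hypothetical names invented for illustration, and the stand-in backends do no real recognition, translation, or synthesis.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Subtitle:
    """One subtitle line (hypothetical structure, for illustration only)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str


# Each stage is a plain callable, so any backend with a matching
# signature (local model, online API, self-hosted service) can be swapped in.
Recognizer = Callable[[str], List[Subtitle]]
Translator = Callable[[List[Subtitle], str], List[Subtitle]]
Synthesizer = Callable[[List[Subtitle]], List[str]]


def translate_video(
    audio_path: str,
    target_lang: str,
    recognize: Recognizer,
    translate: Translator,
    synthesize: Synthesizer,
) -> Tuple[List[Subtitle], List[str]]:
    """Run the three stages in order and return translated subtitles and clip paths."""
    subtitles = recognize(audio_path)               # stage 1: speech recognition
    translated = translate(subtitles, target_lang)  # stage 2: subtitle translation
    clips = synthesize(translated)                  # stage 3: text-to-speech
    return translated, clips


# Stand-in backends (assumptions): a real setup would wrap an actual engine here.
def fake_recognize(audio_path: str) -> List[Subtitle]:
    return [Subtitle(0.0, 1.5, "hello")]


def fake_translate(subs: List[Subtitle], lang: str) -> List[Subtitle]:
    return [Subtitle(s.start, s.end, f"[{lang}] {s.text}") for s in subs]


def fake_synthesize(subs: List[Subtitle]) -> List[str]:
    return [f"clip_{i}.wav" for i, _ in enumerate(subs)]
```

Because each stage only depends on a function signature, replacing one backend (for example, a different dubbing service) leaves the other two stages untouched.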

Highly Customizable

The software provides dozens of customization options. Users can adjust the translation channel, dubbing method, speech recognition engine, voice tone and speed, subtitle style (font, color, size), video output quality, and more. It also lets you control the concurrency of translation and dubbing tasks, for a highly personalized translation experience.
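To illustrate what a concurrency limit on translation or dubbing tasks means in practice, here is a minimal sketch using Python's standard thread pool. It is not taken from the software: `run_limited`, `fake_dub`, and the `max_workers` value are hypothetical, and a real dubbing call would replace the stand-in function.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def run_limited(task: Callable[[str], str], items: Iterable[str], max_workers: int = 2) -> List[str]:
    """Apply task to every item with at most max_workers running at once.

    Results are returned in the same order as the input items.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(task, items))


def fake_dub(line: str) -> str:
    # Stand-in for a real text-to-speech request (assumption).
    return f"dubbed:{line}"
```

A lower limit is gentler on rate-limited online APIs, while a higher one finishes long subtitle files sooner when the backend can keep up.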

Supports Completely Offline Use

If you need offline processing, the software can run entirely locally. Speech recognition can use faster-whisper or openai-whisper, and dubbing can use Clone-voice, GPT-SoVITS, or similar tools, so all operations can be completed without an internet connection.

Flexible Combination of Free and Commercial APIs

The software provides a completely free solution by default, and no core function requires payment. Speech recognition, translation, and dubbing all have free options, such as faster-whisper and Edge-TTS. For users with higher requirements, the software also supports third-party commercial APIs, such as ChatGPT, Azure, and other advanced services, for higher-quality translation and dubbing.

API Integration Support

The software exposes a convenient API, making it easy for developers to integrate it into other tools or workflows.

Comprehensive Documentation and Support Community

We provide complete tutorials and reference documentation on the documentation site (https://pyvideotrans.com). If you run into technical problems, you can ask for help on the community forum (https://bbs.pyvideotrans.com).