Parakeet-API: High-Performance Local Speech Transcription Service
The parakeet-api project is a local speech transcription service based on the NVIDIA Parakeet-tdt-0.6b model. It provides an OpenAI-compatible API and a simple web interface for quickly converting audio or video files into high-precision SRT subtitles, and it can be used with pyVideoTrans v3.72+.
Project Open Source Address: https://github.com/jianchang512/parakeet-api
Windows Integration Package Download:
Integration Package Download Link 1: Download from Baidu Netdisk
Integration Package Download Link 2: Download from HuggingFace.co
How to use: After unzipping, double-click 启动.bat. When the startup interface appears and the browser opens automatically, the service has started successfully.

Usage in pyVideoTrans
Parakeet-API can be seamlessly integrated with the video translation tool pyVideoTrans (versions v3.72 and above).

- Ensure your parakeet-api service is running locally.
- Open the pyVideoTrans software.
- In the menu bar, select Speech Recognition(R) -> Nvidia parakeet-tdt.
- In the pop-up configuration window, set the "http address" to: http://127.0.0.1:5092/v1
- Click "Save" to start using it.
Source Code Deployment Guide
🛠️ Installation and Configuration Guide
This project supports Windows, macOS, and Linux. Please follow the steps below for installation and configuration.
Step 0: Configure Python 3.10 Environment
If you don't have Python 3 installed on your machine, please follow this tutorial: https://pvt9.com/_posts/pythoninstall
Step 1: Prepare FFmpeg
This project uses ffmpeg for audio and video format preprocessing.
Windows (Recommended):
- Download from FFmpeg GitHub Repository and unzip to get ffmpeg.exe.
- Place the downloaded ffmpeg.exe file directly in the project root directory (at the same level as the app.py file). The program will automatically detect and use it without needing to configure environment variables.
macOS (Using Homebrew):
    brew install ffmpeg
Linux (Debian/Ubuntu):
    sudo apt update && sudo apt install ffmpeg
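The lookup order described above (a copy in the project root first, then the system installation) can be sketched as follows. This is a minimal illustration, not the project's actual detection code; `find_ffmpeg` is a hypothetical helper name.

```python
import shutil
import sys
from pathlib import Path
from typing import Optional


def find_ffmpeg(project_root: Path) -> Optional[str]:
    """Return a path to ffmpeg, preferring a copy in project_root, else PATH."""
    # On Windows the binary is ffmpeg.exe; elsewhere it is plain ffmpeg.
    name = "ffmpeg.exe" if sys.platform.startswith("win") else "ffmpeg"
    local_copy = project_root / name
    if local_copy.is_file():
        return str(local_copy)
    # Fall back to whatever is on PATH (for example, installed via brew or apt).
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    found = find_ffmpeg(Path.cwd())
    print(found or "ffmpeg not found: install it or copy it into the project root")
```

Running this from the project root before starting the service is a quick way to confirm that ffmpeg is reachable.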
Step 2: Create Python Virtual Environment and Install Dependencies
Download or clone this project code to your local computer (preferably into a folder on a non-system drive whose path contains only English letters or digits).
Open a terminal or command prompt and navigate to the project root directory (on Windows, you can type cmd in the folder address bar and press Enter).
Create virtual environment:
    python -m venv venv
Activate the virtual environment:
- Windows (CMD/PowerShell): .\venv\Scripts\activate
- macOS / Linux (Bash/Zsh): source venv/bin/activate
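To confirm that activation worked before installing anything, a short check like the one below can be run with `python`. It relies only on the standard library; `in_virtualenv` is a hypothetical helper name used for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside an environment created by `python -m venv`."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    version = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Python {version}, virtual environment active: {in_virtualenv()}")
```

If the output reports that no virtual environment is active, repeat the activation command for your platform in the same terminal.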
Install dependency libraries:
If you do not have an NVIDIA graphics card (CPU only):
    pip install -r requirements.txt
If you have an NVIDIA graphics card (GPU acceleration):
a. Ensure you have installed the latest NVIDIA driver and the corresponding CUDA Toolkit.
b. Uninstall any potentially old PyTorch version:
    pip uninstall -y torch
c. Install PyTorch matching your CUDA version (using CUDA 12.6 as an example):
    pip install torch --index-url https://download.pytorch.org/whl/cu126
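After installing PyTorch, it can help to check whether the installed build actually sees the GPU. The sketch below uses the standard `torch.cuda.is_available()` call; `describe_torch_device` is a hypothetical helper name, and the script is written so that it still runs when PyTorch is absent.

```python
import importlib.util


def describe_torch_device() -> str:
    """Summarize whether PyTorch is installed and whether it can use CUDA."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"
    import torch  # imported lazily so the script still runs without it

    if torch.cuda.is_available():
        name = torch.cuda.get_device_name(0)
        return f"PyTorch {torch.__version__}: CUDA available ({name})"
    return f"PyTorch {torch.__version__}: CUDA not available, running on CPU"


if __name__ == "__main__":
    print(describe_torch_device())
```

If CUDA is reported as unavailable on a machine with an NVIDIA card, the usual suspects are a CPU-only PyTorch build or a driver that does not match the installed CUDA version.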
Step 3: Start the Service
In the terminal with the activated virtual environment, run the following command:
    python app.py
You will see the service start-up messages. The first run downloads the model (approximately 1.2GB), so please be patient.
If a large number of messages appear during startup, don't worry.
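Because the first start can take a while, a script that depends on the service may want to wait until the port accepts connections. The following is a minimal sketch, assuming the service listens on 127.0.0.1:5092 as in the addresses used in this guide; `wait_for_port` is a hypothetical helper name. An open port only shows that the server process is listening, not necessarily that the model has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 600.0, interval: float = 2.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example call (commented out so that importing this file does not block):
# ready = wait_for_port("127.0.0.1", 5092)
```

The generous default timeout is a guess meant to cover the initial model download; adjust it to your connection speed.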

🚀 Usage Guide
Method 1: Using the Web Interface
- Open in your browser: http://127.0.0.1:5092
- Drag and drop or click to upload your audio/video file.
- Click "Start Transcription", wait for the process to complete, then view and download the SRT subtitles below.

Method 2: API Call (Python Example)
You can easily call this service using the openai library.
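from openai import OpenAI

# Point the client at the local service; the example passes a placeholder API key.
client = OpenAI(
    base_url="http://127.0.0.1:5092/v1",
    api_key="any-key",
)

# Upload the file and request the transcription as SRT text.
with open("your_audio.mp3", "rb") as audio_file:
    srt_result = client.audio.transcriptions.create(
        model="parakeet",
        file=audio_file,
        response_format="srt",
    )

print(srt_result)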
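If you want to post-process the subtitles rather than just print them, the SRT text can be split into individual cues. The sketch below assumes the result is standard SRT (index line, timing line, then text, with blank lines between cues); `Cue` and `parse_srt` are hypothetical names, and the sample string stands in for `srt_result`.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    index: int
    start: str
    end: str
    text: str


TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(srt_text: str) -> List[Cue]:
    """Split SRT text into cues; blocks without a valid header are skipped."""
    cues = []
    normalized = srt_text.strip().replace("\r\n", "\n")
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.splitlines()
        if len(lines) < 2:
            continue
        match = TIMING.search(lines[1])
        if not lines[0].strip().isdigit() or match is None:
            continue
        text = "\n".join(lines[2:])
        cues.append(Cue(int(lines[0]), match.group(1), match.group(2), text))
    return cues


# Stand-in for the srt_result string returned by the service.
sample = (
    "1\n00:00:00,000 --> 00:00:02,500\nHello there.\n\n"
    "2\n00:00:02,500 --> 00:00:05,000\nSecond line."
)
for cue in parse_srt(sample):
    print(cue.index, cue.start, cue.end, cue.text)
```

To keep the subtitles, writing `srt_result` to a file with a `.srt` extension and UTF-8 encoding is usually enough for video players and editors to pick it up.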