Google Gemini is mostly used for chat, translation, and general assistance, but it can also do speech recognition, for example transcribing audio or video into subtitle files, and the results are quite good.

Gemini can be used online for free at: https://aistudio.google.com/app/prompts

If this address is blocked in your region, you need a proxy to access it. Even when the page opens from a given country's IP, that country may not be within Gemini's service area. If you see a message that your current country is not in the service area, switch to a node in another country.

Using Gemini in the Browser

As shown in the figure below, first add a prompt that constrains the output. For example, you can require SRT subtitle format, cap the duration of each subtitle, convert Traditional Chinese to Simplified Chinese, or even have the transcription translated into another language at the same time. Then upload the audio or video file. Keep the file small: one that is too large may exceed the token limit and fail.

image.png
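The constraints listed above can be assembled into a prompt programmatically, which makes it easier to vary one restriction at a time. The following is only a minimal sketch: the function name `build_transcription_prompt`, its parameters, and the exact wording of each rule are assumptions made for illustration, not the prompt used in the screenshots.

```python
from typing import Optional


def build_transcription_prompt(
    max_seconds_per_cue: float = 5.0,
    to_simplified_chinese: bool = False,
    translate_to: Optional[str] = None,
) -> str:
    """Compose a transcription prompt from a few output constraints.

    The rule wording is illustrative; adjust it until the model's
    output matches what you need.
    """
    rules = [
        "Transcribe the speech in the attached audio or video.",
        "Return the result in SRT subtitle format and output nothing else.",
        f"Each subtitle must last at most {max_seconds_per_cue:g} seconds.",
    ]
    if to_simplified_chinese:
        rules.append("Convert any Traditional Chinese text to Simplified Chinese.")
    if translate_to:
        rules.append(f"Translate every subtitle into {translate_to}.")
    # Number the rules so each restriction is easy to spot and edit.
    return "\n".join(f"{i}. {rule}" for i, rule in enumerate(rules, start=1))


prompt = build_transcription_prompt(max_seconds_per_cue=5, translate_to="English")
print(prompt)
```

Paste the printed text into the prompt box before uploading the file.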

Then click Run. The results usually come back quickly.

image.png
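Because the output is free-form model text, it can be worth checking that it really is well-formed SRT and that the duration cap from the prompt was respected before using it. The sketch below is one possible way to do that. The helper names `parse_srt` and `overlong_cues` and the sample subtitle text are hypothetical and written only for this illustration.

```python
import re

# Matches an SRT timing line such as "00:00:03,500 --> 00:00:11,000".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*"
    r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_seconds(h: str, m: str, s: str, ms: str) -> float:
    """Convert the four captured timestamp fields to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(text: str) -> list:
    """Return (start, end, text) tuples for every cue with a timing line."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                body = "\n".join(lines[i + 1:])
                cues.append((to_seconds(*g[:4]), to_seconds(*g[4:]), body))
                break
    return cues


def overlong_cues(cues: list, max_seconds: float) -> list:
    """Return the cues whose duration exceeds max_seconds."""
    return [cue for cue in cues if cue[1] - cue[0] > max_seconds]


# Hypothetical sample standing in for text copied from the result pane.
sample = """1
00:00:00,000 --> 00:00:03,500
Hello, everyone.

2
00:00:03,500 --> 00:00:11,000
This cue runs longer than the limit set in the prompt.
"""

cues = parse_srt(sample)
print(len(cues), "cues,", len(overlong_cues(cues, 5.0)), "over the limit")
```

If any cues are reported as too long, tightening the wording of the duration rule in the prompt and running again is usually the simplest fix.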

The results are even better than those of the whisper-large-v2 model, so it is well worth using.

Using Gemini in the Video Translation and Dubbing Software

First, upgrade to the v3.07 patch package.

  1. In the Menu Bar -- Translation Settings -- Gemini pro, fill in your Key and the model to use. You can also edit the prompt used for transcription here.

image.png

  2. Don't forget to enable your proxy/VPN; without it, requests will fail.

image.png

  3. In the speech recognition channel, select Gemini Large Model Recognition, upload the audio or video, select the spoken language, and leave Recut Chinese sentences unselected. Gemini's own sentence segmentation is good, and enabling that option may make the results worse.

image.png

  4. Wait for the recognition results. If you are not satisfied, adjust the prompt and run the recognition again.

image.png
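The proxy requirement in the steps above also applies if you call Gemini from your own Python script instead of through the software. A common approach is to export the conventional proxy environment variables, which Python's `urllib` and many HTTP clients read. This is a sketch under assumptions: the address `http://127.0.0.1:7890` is a hypothetical local proxy, and `apply_proxy` is an illustrative helper, not part of the software.

```python
import os
import urllib.request

# Hypothetical local proxy address; replace with your own proxy's host and port.
PROXY_URL = "http://127.0.0.1:7890"


def apply_proxy(proxy_url: str) -> dict:
    """Export the conventional proxy variables for the current process."""
    for name in ("HTTP_PROXY", "HTTPS_PROXY"):
        os.environ[name] = proxy_url
        os.environ[name.lower()] = proxy_url
    # getproxies() reports the proxy mapping urllib picks up by default.
    return urllib.request.getproxies()


proxies = apply_proxy(PROXY_URL)
print(proxies.get("https"))
```

Call `apply_proxy` before the first network request so that clients created afterwards see the settings.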