Custom Speech Recognition API
Starting from version 3.56, this custom speech recognition channel also supports the Gladia speech recognition service. Please refer to this tutorial for specific usage instructions.
If the existing speech recognition methods do not meet your needs, you can connect your own speech recognition API. Fill in the relevant information in the menu: Speech Recognition Settings -> Custom Speech Recognition API.

Fill in your API address, which must start with http. The system sends a POST request to that address with a WAV audio file attached under the key name audio. The audio has a sample rate of 16 kHz and a single channel (mono). If your API requires key authentication, enter the key in the key field. It is appended to the API address as a query parameter and sent as sk=password.
requests.post(api_url, files={"audio": open(audio_file, 'rb')})
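On the receiving side, it can help to confirm that the upload really matches the format described above before running recognition on it. The following is a minimal sketch using only the Python standard library; the function name check_wav and its parameters are hypothetical and not part of the software, and it assumes you have already read the uploaded audio field into bytes.

```python
import io
import wave


def check_wav(audio_bytes, expected_rate=16000, expected_channels=1):
    """Return (ok, message) for an uploaded WAV payload.

    Hypothetical helper: verifies the 16 kHz, single-channel format
    that the documentation says the system sends.
    """
    try:
        # wave.open accepts a file-like object, so no temp file is needed.
        with wave.open(io.BytesIO(audio_bytes), "rb") as wav:
            rate = wav.getframerate()
            channels = wav.getnchannels()
    except (wave.Error, EOFError) as exc:
        # wave.Error: not a valid WAV file; EOFError: empty or truncated data.
        return False, f"Not a readable WAV file: {exc}"

    if rate != expected_rate or channels != expected_channels:
        return False, (
            f"Expected {expected_rate} Hz with {expected_channels} channel(s), "
            f"got {rate} Hz with {channels} channel(s)"
        )
    return True, "ok"
```

When the check fails, the returned message is a natural candidate for the msg field of a failure response.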
Your API must return data in JSON format. On failure, set code to 1 and msg to the reason for the failure. On success, set code to 0 and put the recognized subtitles in data.
Return on failure:
res={
    "code":1,
    "msg":"Error reason"
}
Return on success:
res={
    "code":0,
    "data":[
        {
            "text":"Subtitle text",
            "time":"00:00:01,000 --> 00:00:06,500"
        },
        {
            "text":"Subtitle text",
            "time":"00:00:06,900 --> 00:00:12,200"
        },
        ... multiple entries
    ]
}
If a key is filled in, it is appended to api_url as a query parameter and sent as api_url?sk=the_filled_sk_value.
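The response format and the sk parameter described above could be handled on the server side roughly as follows. This is a sketch under assumptions rather than a reference implementation: the function names are hypothetical, the recognizer is assumed to yield (start_seconds, end_seconds, text) tuples, and the URL in the usage note is made up for illustration.

```python
import json
from urllib.parse import parse_qs, urlsplit


def format_timestamp(seconds):
    """Convert seconds to the HH:MM:SS,mmm form used in the time field."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def success_response(segments):
    """Build the JSON body for a successful recognition (code 0).

    segments: iterable of (start_seconds, end_seconds, text) tuples,
    an assumed shape for whatever your recognizer produces.
    """
    data = [
        {
            "text": text,
            "time": f"{format_timestamp(start)} --> {format_timestamp(end)}",
        }
        for start, end, text in segments
    ]
    return json.dumps({"code": 0, "data": data}, ensure_ascii=False)


def failure_response(reason):
    """Build the JSON body for a failed recognition (code 1)."""
    return json.dumps({"code": 1, "msg": reason}, ensure_ascii=False)


def key_matches(request_url, expected_key):
    """Check the sk query parameter that is appended to the API address."""
    query = parse_qs(urlsplit(request_url).query)
    return query.get("sk", [""])[0] == expected_key
```

For example, key_matches("http://127.0.0.1:9000/asr?sk=secret", "secret") accepts the request, and a handler could return failure_response("Invalid key") otherwise. json.dumps always emits double-quoted strings, so the output is valid JSON.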