Custom Speech Recognition API
If the built-in speech recognition options do not meet your needs, you can connect your own speech recognition API. Fill in the required information under Menu - Speech Recognition Settings - Custom Speech Recognition API.
Enter your API address, which must start with "http". The audio is sent to that address as a WAV file (16 kHz sample rate, single channel) under the key name "audio". If your API requires key verification, enter the key in the key box; it is appended to the API address as sk=password.
requests.post(api_url, files={"audio": open(audio_file, 'rb')})
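On the receiving side, it can help to confirm that the upload really matches the format described above before running recognition. The following is a minimal sketch using only the Python standard library; the function names check_wav and make_silent_wav are hypothetical, and the 16-bit sample width used to build the test clip is an assumption, since the format above only specifies WAV, 16 kHz, and one channel.

```python
import io
import wave


def check_wav(audio_bytes: bytes) -> tuple:
    """Return (ok, message) for the bytes received under the "audio" key."""
    try:
        with wave.open(io.BytesIO(audio_bytes), "rb") as wav:
            channels = wav.getnchannels()
            rate = wav.getframerate()
    except (wave.Error, EOFError) as exc:
        return False, f"Not a readable WAV file: {exc}"
    if channels != 1:
        return False, f"Expected 1 channel, got {channels}"
    if rate != 16000:
        return False, f"Expected 16000 Hz, got {rate} Hz"
    return True, "ok"


def make_silent_wav(seconds: float = 0.1, rate: int = 16000, channels: int = 1) -> bytes:
    """Build a short silent clip for local testing (16-bit samples assumed)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # assumption: 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(rate * seconds) * channels)
    return buf.getvalue()


if __name__ == "__main__":
    print(check_wav(make_silent_wav()))
    print(check_wav(make_silent_wav(rate=44100)))
```

When the check fails, the message can be returned to the caller as the failure reason.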
Your API must return JSON-formatted data. On failure, set code to 1 and msg to the reason for the failure.
Failure response:
res={
    "code":1,
    "msg":"Error reason"
}
Success response:
res={
    "code":0,
    "data":[
        {
            "text":"Subtitle text",
            "time":"00:00:01,000 --> 00:00:06,500"
        },
        {
            "text":"Subtitle text",
            "time":"00:00:06,900 --> 00:00:12,200"
        },
        # ...multiple
    ]
}
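If your recognizer produces segments as start and end times in seconds, the two response shapes above could be assembled roughly as follows. This is only a sketch: the helper names srt_timestamp, success_response, and failure_response are hypothetical, and the (start, end, text) tuple layout is an assumption about your own recognizer's output rather than something the API prescribes.

```python
import json


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def success_response(segments) -> dict:
    """Build the success payload from (start_seconds, end_seconds, text) tuples."""
    return {
        "code": 0,
        "data": [
            {
                "text": text,
                "time": f"{srt_timestamp(start)} --> {srt_timestamp(end)}",
            }
            for start, end, text in segments
        ],
    }


def failure_response(reason: str) -> dict:
    """Build the failure payload with the reason in msg."""
    return {"code": 1, "msg": reason}


if __name__ == "__main__":
    segments = [(1.0, 6.5, "Subtitle text"), (6.9, 12.2, "Subtitle text")]
    # ensure_ascii=False keeps non-ASCII subtitle text readable in the output
    print(json.dumps(success_response(segments), ensure_ascii=False, indent=2))
    print(json.dumps(failure_response("Error reason")))
```

Serializing with json.dumps also guarantees double-quoted strings, which strict JSON parsers require.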
If you filled in a key, it is appended to api_url as a query parameter before the request is sent: api_url?sk=<your sk value>
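To illustrate the key handling, the sketch below mirrors the documented URL form on the sending side and checks the sk parameter on the receiving side. Both function names are hypothetical, and whether the sending tool URL-encodes the key is not stated here, so the plain concatenation is an assumption.

```python
import hmac
from urllib.parse import parse_qs, urlsplit


def client_url(api_url: str, sk: str = "") -> str:
    """Mirror the documented form: api_url?sk=<value> when a key is set."""
    return f"{api_url}?sk={sk}" if sk else api_url


def sk_is_valid(request_url: str, expected_sk: str) -> bool:
    """Compare the sk query parameter of an incoming URL with the expected key."""
    values = parse_qs(urlsplit(request_url).query).get("sk", [""])
    # compare_digest avoids leaking the key through timing differences
    return hmac.compare_digest(values[0].encode(), expected_sk.encode())


if __name__ == "__main__":
    url = client_url("http://127.0.0.1:5000/asr", "secret")
    print(url, sk_is_valid(url, "secret"))
```

A request that fails this check can be answered with the failure response, with msg explaining that the key is missing or wrong.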