In the rapidly evolving world of digital content and communication, seamless integration between audio and text is becoming increasingly paramount. Recognizing this demand, we’re offering the AI-powered Audio Transcription API. This advanced service leverages state-of-the-art machine-learning techniques to generate a prompt-driven transcription of audio files, turning spoken words into readable text, and unlocking a host of new possibilities for businesses and developers alike.
The Audio Transcription API Plugin can be applied in many real-world scenarios to enhance communication, accessibility, and user experience across industries and applications.
Our API is designed for easy integration, so developers can incorporate it into their applications with little effort. It supports various audio formats and languages, which makes it versatile and globally applicable. We encourage developers and businesses to tap into this tool for better communication and content creation. Whether you’re a software engineer integrating this functionality into your apps or a business owner looking to use AI-powered transcription for operational improvements, our API is here to serve your needs.
The service installation asks for three settings: OpenAI API key, OpenAI audio model, and Audio folder. Enter the required details and click the Save button.
There is only one method in this service – requestTranscription. This method transcribes an audio file in any language to English.
Method:
POST
Endpoint URL:
https://xxxx.backendless.app/api/services/OpenAIAudioTranscription/transcribe
The xxxx.backendless.app part is the subdomain assigned to your application. For more information, see the Client-side Setup section of the Backendless documentation.
Request Headers:
Content-Type: application/json
Request Body:
The request body must be a JSON object with the structure shown below:
{
  "audioFileUrl": "string",
  "model": "string",
  "prompt": "string",
  "language": "string",
  "responseFormat": "string",
  "temperature": 0,
  "saveSource": false
}
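As a rough illustration of how a client might assemble this body, the Python sketch below builds the JSON object and applies a few checks before sending. The helper name build_transcription_payload and the file-extension check are assumptions made for this example, not part of the service; the allowed formats, response formats, and temperature range are taken from the parameter descriptions in this section.

```python
import json
from urllib.parse import urlparse

# Values taken from the parameter descriptions in this section.
SUPPORTED_AUDIO_FORMATS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}


def build_transcription_payload(audio_file_url, model=None, prompt=None, language=None,
                                response_format=None, temperature=None, save_source=None):
    """Return a dict for the request body, omitting optional fields left as None."""
    if not audio_file_url:
        raise ValueError("audioFileUrl is required")

    # Assumption: the audio format can be inferred from the file extension in the URL.
    # The service itself may not require an extension; this is only a client-side check.
    path = urlparse(audio_file_url).path
    extension = path.rsplit(".", 1)[-1].lower() if "." in path else ""
    if extension not in SUPPORTED_AUDIO_FORMATS:
        raise ValueError(f"unsupported audio format: {extension or 'unknown'}")

    if response_format is not None and response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported responseFormat: {response_format}")

    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")

    payload = {"audioFileUrl": audio_file_url}
    optional = {
        "model": model,
        "prompt": prompt,
        "language": language,
        "responseFormat": response_format,
        "temperature": temperature,
        "saveSource": save_source,
    }
    # Leave out anything the caller did not set so the service applies its defaults.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


def to_json(payload):
    """Serialize the payload to the JSON string sent as the request body."""
    return json.dumps(payload)
```

Omitting unset fields, rather than sending nulls, keeps the request close to the documented structure and lets the service fall back to its own defaults.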
Parameters explanation:
audioFileUrl – Required. URL of an audio file in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
model – Optional. Name of an OpenAI model. For the list of available models, refer to https://platform.openai.com/docs/models/overview. Only audio models can be used in this service; whisper-1 is the default.
prompt – Optional. Text to guide the model’s style or continue a previous audio segment. The prompt should be in English.
language – Optional. The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.
responseFormat – Optional. The format of the transcript output, one of: json, text, srt, verbose_json, or vtt. json is used by default.
temperature – Optional. The sampling temperature. The value must be between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.
saveSource – Optional. Defaults to false. If the parameter is set to true, the service will save the input source file from the audioFileUrl parameter in the Backendless file storage. The file will be saved in the folder specified during the service installation.
Response Body:
Returns the transcription in the format specified by the responseFormat value, JSON by default.
Example:
curl -X "POST" "https://xxx.backendless.app/api/services/OpenAIAudioTranscription/transcribe" \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d $'{
  "audioFileUrl": "https://xxx.backendless.app/api/files/audio/export_111.mp3",
  "responseFormat": "json",
  "temperature": 0.5,
  "saveSource": false
}'
Response:
{
"text": "OpenAI's Transcription API can be valuable in various use case scenarios for converting spoken language into written text."
}
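For readers calling the service from code rather than curl, the same request might look like the following Python sketch, which uses only the standard library. The function names are hypothetical, the endpoint keeps the xxxx placeholder from this section, and the response handling assumes that the json and verbose_json formats return an object with a text field (as in the sample response) while the other formats return plain text; verify that against your own deployment.

```python
import json
import urllib.request

# Placeholder from this section; replace xxxx with your application's subdomain.
ENDPOINT = "https://xxxx.backendless.app/api/services/OpenAIAudioTranscription/transcribe"


def build_request(payload, endpoint=ENDPOINT):
    """Create the POST request with the JSON body and headers used in the curl example."""
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def extract_text(raw_body, response_format="json"):
    """Pull the transcript out of a raw response body.

    Assumption: JSON formats carry the transcript in a "text" field;
    text, srt, and vtt are treated as plain text and returned unchanged.
    """
    body = raw_body.decode("utf-8")
    if response_format in ("json", "verbose_json"):
        return json.loads(body)["text"]
    return body


def transcribe(payload, endpoint=ENDPOINT, timeout=60):
    """Send the request and return the transcript as a string."""
    request = build_request(payload, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_text(response.read(), payload.get("responseFormat", "json"))
```

Calling transcribe({"audioFileUrl": "...", "responseFormat": "json"}) with a real audio URL and your own subdomain would then return the transcript text; urllib raises an HTTPError for non-2xx responses, which the caller should handle.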