AI Audio Translation Plugin

In a world that’s more connected than ever, effective cross-border communication is paramount. Whether you’re a technical guru seeking a seamless solution or a business visionary looking to break down language barriers, our revolutionary Audio-to-Text Translation API is here to transform how you interact with the world.

The API converts spoken audio in any supported language into English text. It can be used in a variety of real-world scenarios to facilitate language translation, for example:

Real-Time Language Translation: A user could speak in one language, and the API could translate the spoken words into another language in real-time. This is particularly useful for conversations between people who speak different languages.
Transcribing Multilingual Podcasts: Podcast creators who conduct interviews or discussions in multiple languages can use the API to automatically transcribe and translate the spoken content, making it accessible to a broader audience.
Language Learning and Pronunciation Improvement: Language learners can use the API to practice speaking in their target language. The API can provide real-time feedback on pronunciation and translate spoken sentences into the target language.
Translating Conference Calls: Businesses with international teams can use the API to translate spoken conversations in conference calls, ensuring all team members understand the discussion.
Voice Assistant Multilingual Support: Voice assistant developers can use the API to make their virtual assistants multilingual, allowing users to interact with the assistant in their preferred language.
Accessible Content for the Deaf and Hard of Hearing: The API can transcribe and translate spoken content into text for individuals who are deaf or hard of hearing, enhancing accessibility to audio content.
Emergency Services and Crisis Communication: Emergency services can use the API to communicate with non-English speakers during crises, providing vital information in the native languages of affected populations.
Travel and Tourism: Travel apps can provide real-time translation for tourists, helping them communicate and navigate in foreign countries.
Customer Support Hotlines: Businesses can offer multilingual customer support hotlines, where the API translates customer queries and support responses in real time.
Legal Interpretation: Legal professionals can use the API to provide interpretation and translation services during legal proceedings, ensuring that all parties understand the proceedings.

These are just a few examples of how the Audio Translation API Plugin can be applied in real-world scenarios to enhance communication, accessibility, and user experience for various industries and applications.

Installation Instructions

Log in to Backendless Console and select your app. Open the Marketplace screen, select the API Services section, and install the OpenAI Audio Translation Plugin.
During the installation, you are prompted to enter your OpenAI API key, the OpenAI audio model, and an audio folder name. Enter the required details and click the Save button.
To verify the installation, click the Cloud Code icon in the Backendless Console and confirm that the OpenAIAudioTranslation API service appears in the list of services.
If you need to change any of the configuration settings (API key, OpenAI audio model, or the default folder for the audio files), click the gear icon to access the Service Configuration popup.

Service Method

There is only one method in this service – requestTranslation. This method translates an audio file in any language to English.

Method:

POST

Endpoint URL:

https://xxxx.backendless.app/api/services/OpenAIAudioTranslation/translate

The xxxx.backendless.app is a subdomain assigned to your application. For more information, see the Client-side Setup section of the Backendless documentation.

Request Headers:

Content-Type: application/json

Request Body:

The request body must be a JSON object with the structure shown below:

{
"audioFileUrl": "string",
"model": "string",
"prompt": "string",
"responseFormat": "string",
"temperature": 0,
"saveSource": false
}

Parameters explanation:

audioFileUrl – Required. A URL to an audio file in one of the following formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
model – Optional. The OpenAI model to use. For a list of models, refer to https://platform.openai.com/docs/models/overview. Only audio models can be used in this service. The default is whisper-1.
prompt – Optional. A text to guide the model’s style or continue a previous audio segment. The prompt should be in English.
responseFormat – Optional. The format of the translation output, one of: json, text, srt, verbose_json, or vtt. The default is json.
temperature – Optional. The sampling temperature, a number between 0 and 1. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. If set to 0, the model uses log probability to automatically increase the temperature until certain thresholds are hit.
saveSource – Optional. The default value is false. Set this to true if you want to save the file referenced in the audioFileUrl parameter in the Backendless file storage of your app. The file is saved in the folder you specified during the service installation.
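The parameter rules above can be checked on the client before a request is sent. The following is a minimal Python sketch, not part of the plugin: the helper name build_payload and its validation behavior are assumptions made for illustration. It leaves out any optional field that is not set, on the assumption that the service then applies the defaults described above.

```python
import json

# Output formats listed in the parameter description above.
ALLOWED_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}


def build_payload(audio_file_url, model=None, prompt=None,
                  response_format=None, temperature=None, save_source=None):
    """Assemble the JSON request body (hypothetical helper, not a plugin API).

    Optional fields left as None are omitted from the body.
    """
    if not audio_file_url:
        raise ValueError("audioFileUrl is required")
    if response_format is not None and response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported responseFormat: {response_format!r}")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")

    payload = {"audioFileUrl": audio_file_url}
    optional = {
        "model": model,
        "prompt": prompt,
        "responseFormat": response_format,
        "temperature": temperature,
        "saveSource": save_source,
    }
    # Keep only the optional fields the caller actually set.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


# Placeholder URL; replace xxxx with your application's subdomain.
body = build_payload(
    "https://xxxx.backendless.app/api/files/audio/export_ofoct.mp3",
    response_format="json",
    temperature=0.5,
    save_source=True,
)
print(json.dumps(body, indent=2))
```

Validating locally is a design choice rather than a requirement: it reports an out-of-range temperature or an unknown format before any network call is made.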

Response Body:

Returns the translation in the format specified by the responseFormat parameter (json by default).
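Because the shape of the response depends on responseFormat, client code usually needs one branch per format. The Python sketch below illustrates one way to do that. The function name extract_text is hypothetical. The sketch assumes that json and verbose_json responses are JSON objects with a "text" field, and that text, srt, and vtt responses arrive as plain text; verify this against your own service responses.

```python
import json


def extract_text(body, response_format="json"):
    """Return the translated content from a raw response body.

    Hypothetical helper. Assumes JSON formats carry a "text" field and the
    other formats are returned as plain text.
    """
    if response_format in ("json", "verbose_json"):
        return json.loads(body)["text"]
    if response_format == "text":
        return body.strip()
    # srt and vtt are subtitle formats; return them unchanged so that
    # timestamps and cue numbering are preserved.
    return body


print(extract_text('{"text": "Hello, world."}'))
```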

Example:

curl -X "POST" "https://xxxx.backendless.app/api/services/OpenAIAudioTranslation/translate" \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d $'{
"audioFileUrl": "https://xxxx.backendless.app/api/files/audio/export_ofoct.mp3",
"responseFormat": "json",
"temperature": 0.5,
"saveSource": true
}'

Response:

{
"text": "OpenAI's Translation API can be utilized in various real-world use cases to facilitate language translation."
}
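The same call can be made from application code. The following Python sketch uses only the standard library. The function names build_request and request_translation, the timeout value, and the placeholder subdomain are assumptions for illustration; the endpoint path and headers are taken from the example above. It expects the default json response format.

```python
import json
import urllib.request


def build_request(app_domain, payload):
    """Create the POST request for the translate endpoint (hypothetical helper)."""
    url = f"https://{app_domain}/api/services/OpenAIAudioTranslation/translate"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def request_translation(app_domain, payload, timeout=60):
    """Send the request and decode a JSON response (assumes responseFormat is json)."""
    request = build_request(app_domain, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires a real application subdomain in place of xxxx):
# result = request_translation(
#     "xxxx.backendless.app",
#     {"audioFileUrl": "https://xxxx.backendless.app/api/files/audio/export_ofoct.mp3"},
# )
# print(result["text"])
```

Building the request separately from sending it makes it possible to inspect the URL, headers, and body without contacting the service.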

Codeless Reference
