Transcriptions
Transcribe audio files to text using the Sansa API
Converts an audio file to text. Compatible with the OpenAI audio.transcriptions SDK client. Sansa routes the file to a transcription-optimized model automatically.
POST /v1/audio/transcriptions
The request body must be multipart/form-data. See the code panel for examples in Python, TypeScript, and curl.
Form Parameters
file
Required. The audio file to transcribe. Must be multipart/form-data.
Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm.
Maximum file size: 25 MB.
model
Optional. Any value is accepted — Sansa ignores it and always routes to the appropriate transcription model. Pass "whisper-1" or "gpt-4o-transcribe" for OpenAI SDK compatibility.
language
Optional. ISO-639-1 language code (e.g., "en", "fr", "de"). Providing the language improves accuracy and speed. If omitted, the model detects the language automatically.
prompt
Optional. A text string to guide the model's style or provide context for the audio. Useful for supplying proper nouns, acronyms, or domain-specific vocabulary that the model might otherwise transcribe incorrectly.
temperature
Default: 0.0. Sampling temperature between 0.0 and 1.0. Lower values produce more deterministic output.
response_format
Default: "json". Controls the output format.
| Value | Description |
|---|---|
"json" | Returns {"text": "..."} |
"text" | Returns the transcription as a plain string |
Only "json" and "text" are supported. Other values return 400 with code unsupported_response_format.
Returns
JSON (default)
{
"text": "Hello, world. This is a transcription of the audio."
}Text
A plain string with the transcription. The Content-Type header will be text/plain.
Errors
| HTTP Status | Code | When |
|---|---|---|
400 | invalid_request | File missing, filename missing, or file is empty |
400 | invalid_file_format | File extension not in supported formats |
400 | file_too_large | File exceeds 25 MB |
400 | unsupported_response_format | response_format is not "json" or "text" |
401 | unauthorized | Invalid or missing API key |
402 | insufficient_credits | Account balance is zero or negative |
429 | rate_limit_exceeded | Too many requests |
500 | internal_error | Unexpected server error |
Billing
- Cost is estimated before the request and reserved from your balance.
- After the response completes, cost is recalculated using actual token usage and the difference is refunded or deducted.
- Failed requests are not charged.
Audio in chat completions
For conversational use cases — asking the model questions about audio content, or sending audio as part of a multi-turn conversation — use input_audio content parts in /v1/chat/completions instead. See the Completions docs for the ContentPart interface.