Transcriptions

Transcribe audio files to text using the Sansa API

Converts an audio file to text. Compatible with the OpenAI audio.transcriptions SDK client. Sansa routes the file to a transcription-optimized model automatically.

POST /v1/audio/transcriptions

The request body must be multipart/form-data. See the code panel for examples in Python, TypeScript, and curl.
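As a rough sketch of what a multipart/form-data request to this endpoint looks like on the wire, the helper below builds one with only the Python standard library and does not send it. The host in `BASE_URL`, the Bearer authorization scheme, and the helper name `build_transcription_request` are assumptions made for illustration; they are not taken from the Sansa documentation.

```python
import urllib.request
import uuid

# Assumption: placeholder host; substitute your actual Sansa API base URL.
BASE_URL = "https://api.example.com"


def build_transcription_request(api_key, filename, audio_bytes, fields=None):
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields such as language or response_format.
    for name, value in (fields or {}).items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    # The audio goes in the "file" part, which carries a filename.
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n".encode("utf-8")
        + audio_bytes
        + b"\r\n"
    )
    # Closing boundary marks the end of the body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        BASE_URL + "/v1/audio/transcriptions",
        data=b"".join(parts),
        method="POST",
        headers={
            # Assumption: Bearer-token auth, as is common for OpenAI-compatible APIs.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

The returned object can be passed to `urllib.request.urlopen` to actually send it. In practice the OpenAI SDK client performs this encoding for you; the sketch only shows what the endpoint expects.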


Form Parameters

file

Required. The audio file to transcribe, sent as a file part of the multipart/form-data body. The part must include a filename, and the file must not be empty.

Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm.

Maximum file size: 25 MB.
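The format and size limits can be checked on the client before uploading, which avoids a wasted round trip. The following is a minimal sketch: the helper name `check_audio_file` is hypothetical, and it assumes that "25 MB" means 25 * 1024 * 1024 bytes and that extensions are matched case-insensitively. The server may apply either rule differently.

```python
import os

SUPPORTED_FORMATS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
# Assumption: 25 MB is read as 25 * 1024 * 1024 bytes; the server may count differently.
MAX_FILE_BYTES = 25 * 1024 * 1024


def check_audio_file(filename, size_bytes):
    """Return the error code the API would likely report, or None if the file looks fine."""
    if not filename or size_bytes == 0:
        return "invalid_request"
    # Assumption: the extension check ignores case.
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in SUPPORTED_FORMATS:
        return "invalid_file_format"
    if size_bytes > MAX_FILE_BYTES:
        return "file_too_large"
    return None
```

A caller would typically pass `os.path.getsize(path)` as `size_bytes` and skip the upload when the result is not `None`.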

model

Optional. Any value is accepted; Sansa ignores it and always routes to the appropriate transcription model. Pass "whisper-1" or "gpt-4o-transcribe" for OpenAI SDK compatibility.

language

Optional. ISO-639-1 language code (e.g., "en", "fr", "de"). Providing the language improves accuracy and speed. If omitted, the model detects the language automatically.

prompt

Optional. A text string to guide the model's style or provide context for the audio. Useful for supplying proper nouns, acronyms, or domain-specific vocabulary that the model might otherwise transcribe incorrectly.

temperature

Default: 0.0. Sampling temperature between 0.0 and 1.0. Lower values produce more deterministic output.

response_format

Default: "json". Controls the output format.

Value     Description
"json"    Returns {"text": "..."}
"text"    Returns the transcription as a plain string

Only "json" and "text" are supported. Other values return 400 with code unsupported_response_format.
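To illustrate how the optional parameters fit together, here is a small sketch that assembles the text form fields and rejects values outside the documented ranges before any request is made. The function name `build_form_fields` is hypothetical, and raising `ValueError` is just one way to surface the problem on the client.

```python
def build_form_fields(language=None, prompt=None, temperature=0.0, response_format="json"):
    """Assemble the text form fields, checking the documented constraints first."""
    if response_format not in ("json", "text"):
        # The API would answer 400 unsupported_response_format for this value.
        raise ValueError("unsupported_response_format")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    fields = {
        # The model value is ignored by Sansa; this one keeps OpenAI SDK parity.
        "model": "whisper-1",
        "temperature": str(temperature),
        "response_format": response_format,
    }
    # Optional fields are only sent when the caller supplies them.
    if language is not None:
        fields["language"] = language
    if prompt is not None:
        fields["prompt"] = prompt
    return fields
```

For example, `build_form_fields(language="en", response_format="text")` yields the fields for an English transcription returned as a plain string.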


Returns

JSON (default)

{
  "text": "Hello, world. This is a transcription of the audio."
}

Text

A plain string with the transcription. The Content-Type header will be text/plain.
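Since the body shape depends on response_format, a client can branch on the Content-Type header to get the transcript out. This is a minimal sketch with a hypothetical helper, `extract_transcript`. It assumes the body is UTF-8 encoded and treats any response that is not text/plain as the JSON form.

```python
import json


def extract_transcript(content_type, body):
    """Return the transcript from a response body, given its Content-Type header."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    text = body.decode("utf-8")  # Assumption: responses are UTF-8 encoded.
    if media_type == "text/plain":
        return text
    # Otherwise assume the default JSON shape: {"text": "..."}.
    return json.loads(text)["text"]
```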


Errors

HTTP Status  Code                         When
400          invalid_request              File missing, filename missing, or file is empty
400          invalid_file_format          File extension not in supported formats
400          file_too_large               File exceeds 25 MB
400          unsupported_response_format  response_format is not "json" or "text"
401          unauthorized                 Invalid or missing API key
402          insufficient_credits         Account balance is zero or negative
429          rate_limit_exceeded          Too many requests
500          internal_error               Unexpected server error
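One possible way to act on these codes is to retry only the transient ones and fail fast on the rest. The sketch below is an assumption about client policy, not documented Sansa behavior: the choice of retryable codes, the exponential backoff, and the helper name `backoff_delay` are all illustrative.

```python
# Assumption: only these two codes are worth retrying; the others need a
# change to the request, the API key, or the account balance.
RETRYABLE_CODES = {"rate_limit_exceeded", "internal_error"}


def backoff_delay(code, attempt, base_seconds=1.0, max_seconds=30.0):
    """Return seconds to wait before retry number `attempt`, or None if retrying will not help."""
    if code not in RETRYABLE_CODES:
        return None
    # Exponential backoff: 1s, 2s, 4s, ... capped at max_seconds.
    return min(max_seconds, base_seconds * 2 ** attempt)
```

A caller would sleep for the returned delay and resend the same request, giving up after a fixed number of attempts.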

Billing

  • Cost is estimated before the request and reserved from your balance.
  • After the response completes, cost is recalculated using actual token usage and the difference is refunded or deducted.
  • Failed requests are not charged.
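The reserve-then-settle flow above amounts to simple arithmetic on the account balance. The sketch below works it through with made-up amounts; the function name `settle` and every figure are hypothetical, and actual pricing is not described here.

```python
from decimal import Decimal


def settle(balance, estimated_cost, actual_cost, succeeded=True):
    """Return the balance after a request, following the reserve-then-settle steps."""
    # Step 1: the estimate is reserved before the request runs.
    reserved_balance = balance - estimated_cost
    if not succeeded:
        # Failed requests are not charged, so the whole reservation is released.
        return balance
    # Step 2: a positive adjustment is a refund, a negative one a further deduction.
    adjustment = estimated_cost - actual_cost
    return reserved_balance + adjustment
```

With a balance of 10.00, an estimate of 0.05, and an actual cost of 0.03, the final balance is 9.97: 0.05 is reserved and 0.02 is refunded. If the actual cost were 0.07 instead, a further 0.02 would be deducted, leaving 9.93.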

Audio in chat completions

For conversational use cases, such as asking the model questions about audio content or sending audio as part of a multi-turn conversation, use input_audio content parts in /v1/chat/completions instead. See the Completions docs for the ContentPart interface.