Transcriptions

Transcribe audio files to text using the Sansa API

Converts an audio file to text. Compatible with the OpenAI audio.transcriptions SDK client. Sansa routes the file to a transcription-optimized model automatically.

POST /v1/audio/transcriptions

The request body must be multipart/form-data. See the code panel for examples in Python, TypeScript, and curl.

Form Parameters

file

Required. The audio file to transcribe. Must be multipart/form-data.

Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm.

Maximum file size: 25 MB.

model

Optional. Any value is accepted — Sansa ignores it and always routes to the appropriate transcription model. Pass "whisper-1" or "gpt-4o-transcribe" for OpenAI SDK compatibility.

language

Optional. ISO-639-1 language code (e.g., "en", "fr", "de"). Providing the language improves accuracy and speed. If omitted, the model detects the language automatically.

prompt

Optional. A text string to guide the model's style or provide context for the audio. Useful for supplying proper nouns, acronyms, or domain-specific vocabulary that the model might otherwise transcribe incorrectly.

temperature

Default: 0.0. Sampling temperature between 0.0 and 1.0. Lower values produce more deterministic output.

response_format

Default: "json". Controls the output format.

Value	Description
`"json"`	Returns `{"text": "..."}`
`"text"`	Returns the transcription as a plain string

Only "json" and "text" are supported. Other values return 400 with code unsupported_response_format.

Returns

JSON (default)

{
  "text": "Hello, world. This is a transcription of the audio."
}

Text

A plain string with the transcription. The Content-Type header will be text/plain.

Errors

HTTP Status	Code	When
`400`	`invalid_request`	File missing, filename missing, or file is empty
`400`	`invalid_file_format`	File extension not in supported formats
`400`	`file_too_large`	File exceeds 25 MB
`400`	`unsupported_response_format`	`response_format` is not `"json"` or `"text"`
`401`	`unauthorized`	Invalid or missing API key
`402`	`insufficient_credits`	Account balance is zero or negative
`429`	`rate_limit_exceeded`	Too many requests
`500`	`internal_error`	Unexpected server error

Billing

Cost is estimated before the request and reserved from your balance.
After the response completes, cost is recalculated using actual token usage and the difference is refunded or deducted.
Failed requests are not charged.

Audio in chat completions

For conversational use cases — asking the model questions about audio content, or sending audio as part of a multi-turn conversation — use input_audio content parts in /v1/chat/completions instead. See the Completions docs for the ContentPart interface.