Synchronous

The synchronous endpoint processes the entire input and returns a single audio payload—ideal for short prompts, notifications, or background jobs that do not require streaming.

Quick Example

curl --request POST "https://f.cluster.resemble.ai/synthesize" \
  -H "Authorization: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Accept-Encoding: gzip" \
  --data '{
    "voice_uuid": "55592656",
    "data": "Hello from Resemble!",
    "sample_rate": 48000,
    "output_format": "wav"
  }'

Decode the audio_content field from base64 to retrieve the raw audio bytes.
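As a minimal sketch of that decoding step in Python, the helper below parses a response body and returns the decoded bytes. The function name `extract_audio` is hypothetical, and the sample payload is a four-byte stand-in rather than real audio.

```python
import base64
import json


def extract_audio(response_text: str) -> bytes:
    """Return the raw audio bytes from a synthesize response body (JSON text)."""
    payload = json.loads(response_text)
    # validate=True raises on characters outside the base64 alphabet
    # instead of silently discarding them.
    return base64.b64decode(payload["audio_content"], validate=True)


# Stand-in response: "UklGRg==" is the base64 encoding of the bytes b"RIFF",
# the four-byte tag that opens a WAV file. It is not a playable clip.
sample_response = json.dumps({"audio_content": "UklGRg=="})
audio_bytes = extract_audio(sample_response)
```

Writing `audio_bytes` to a file opened in binary mode (`"wb"`) gives the same result as the `base64 --decode` step on the command line.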

Endpoint

POST https://f.cluster.resemble.ai/synthesize

Request Headers

| Header | Value | Description |
| --- | --- | --- |
| Authorization | YOUR_API_KEY | API key from the dashboard. |
| Content-Type | application/json | JSON request body. |
| Accept-Encoding | gzip, deflate, or br | Optional compression. |
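The headers above could be set from Python's standard library roughly as follows. This is a sketch under assumptions: `build_request` and `read_json` are hypothetical helper names, and the code assumes the server signals gzip compression through the standard `Content-Encoding` response header. `urllib` does not decompress responses on its own, so the body is unpacked explicitly.

```python
import gzip
import json
import urllib.request
from typing import Optional

SYNTHESIZE_URL = "https://f.cluster.resemble.ai/synthesize"


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying the documented headers."""
    return urllib.request.Request(
        SYNTHESIZE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": api_key,
            "Content-Type": "application/json",
            "Accept-Encoding": "gzip",  # optional; omit to receive plain JSON
        },
        method="POST",
    )


def read_json(raw: bytes, content_encoding: Optional[str]) -> dict:
    """Decode a response body, undoing gzip if the server applied it."""
    if content_encoding and content_encoding.lower() == "gzip":
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))
```

To send the request, pass the result of `build_request` to `urllib.request.urlopen`, then call `read_json(resp.read(), resp.headers.get("Content-Encoding"))` on the response.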

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| voice_uuid | string | Yes | Voice to synthesize. |
| data | string | Yes | Text or SSML to synthesize (≤ 2,000 characters). |
| project_uuid | string | No | Project to store the clip in. |
| title | string | No | Title for the generated clip. |
| precision | string | No | MULAW, PCM_16, PCM_24, PCM_32 (default). Applies to WAV output. |
| output_format | string | No | wav (default) or mp3. |
| sample_rate | number | No | 8000, 16000, 22050, 32000, 44100, or 48000. Defaults to 48000. |
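The constraints in this table can be checked on the client before a request is sent. The sketch below is one possible way to do that; `build_body` is a hypothetical helper, and it assumes the 2,000-character limit is counted in Unicode code points, which the table does not specify.

```python
from typing import Any, Dict, Optional

MAX_TEXT_CHARS = 2000
SAMPLE_RATES = {8000, 16000, 22050, 32000, 44100, 48000}
OUTPUT_FORMATS = {"wav", "mp3"}
PRECISIONS = {"MULAW", "PCM_16", "PCM_24", "PCM_32"}


def build_body(
    voice_uuid: str,
    data: str,
    *,
    sample_rate: Optional[int] = None,
    output_format: Optional[str] = None,
    precision: Optional[str] = None,
    project_uuid: Optional[str] = None,
    title: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate fields against the documented values and return a JSON-ready dict."""
    if not voice_uuid:
        raise ValueError("voice_uuid is required")
    if not data:
        raise ValueError("data is required")
    # Assumption: the limit is measured in code points, i.e. len() of a str.
    if len(data) > MAX_TEXT_CHARS:
        raise ValueError(f"data has {len(data)} characters; the limit is {MAX_TEXT_CHARS}")
    if sample_rate is not None and sample_rate not in SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate}")
    if output_format is not None and output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format}")
    if precision is not None and precision not in PRECISIONS:
        raise ValueError(f"unsupported precision: {precision}")

    body: Dict[str, Any] = {"voice_uuid": voice_uuid, "data": data}
    optional = {
        "sample_rate": sample_rate,
        "output_format": output_format,
        "precision": precision,
        "project_uuid": project_uuid,
        "title": title,
    }
    # Leave unset fields out so the server applies its documented defaults.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body
```

For example, `build_body("55592656", "Hello from Resemble!", sample_rate=48000, output_format="wav")` yields the same JSON object as the Quick Example, while an unlisted sample rate such as 12345 raises `ValueError` before any network call is made.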

Response

{
  "audio_content": "<base64>",
  "audio_timestamps": {
    "graph_chars": ["H", "e", "l", "l", "o"],
    "graph_times": [[0.0, 0.12], [0.12, 0.24], ...],
    "phon_chars": [],
    "phon_times": []
  },
  "duration": 1.68,
  "issues": [],
  "output_format": "wav",
  "sample_rate": 48000,
  "seed": 962692783,
  "success": true,
  "synth_duration": 1.64,
  "title": null
}

| Field | Type | Description |
| --- | --- | --- |
| audio_content | string | Base64-encoded audio bytes. |
| audio_timestamps | object | Timestamp arrays for graphemes and phonemes. Grapheme timestamps (graph_chars, graph_times) are supported for all models, with times in seconds as [start, end] pairs. Phoneme timestamps (phon_chars, phon_times) return empty arrays for newer models and are only populated by legacy models. |
| duration | number | Final clip duration in seconds. |
| issues | array | Issues related to the request. |
| output_format | string | Echoes the requested format. |
| sample_rate | number | Echoes the requested sample rate. |
| seed | number | Random seed used for generation. |
| success | boolean | Whether the synthesis succeeded. |
| synth_duration | number | Raw synthesis time prior to post-processing. |
| title | string or null | Title saved with the clip, or null if not provided. |
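To illustrate how the grapheme timestamps line up, the sketch below pairs each entry of graph_chars with its [start, end] pair from graph_times. The helper name `grapheme_timings` is hypothetical, and the code assumes that a failed request still returns a JSON body with success set to false and an issues array, which this page does not confirm.

```python
from typing import Any, Dict, List, Tuple


def grapheme_timings(response: Dict[str, Any]) -> List[Tuple[str, float, float]]:
    """Pair each grapheme with its start and end time in seconds."""
    if not response.get("success"):
        # Assumption: failures carry their details in the issues array.
        raise RuntimeError(f"synthesis failed: {response.get('issues')}")
    stamps = response["audio_timestamps"]
    chars, times = stamps["graph_chars"], stamps["graph_times"]
    if len(chars) != len(times):
        raise ValueError("graph_chars and graph_times differ in length")
    return [(char, start, end) for char, (start, end) in zip(chars, times)]


# Hypothetical two-character response, shaped like the example above.
example = {
    "success": True,
    "issues": [],
    "audio_timestamps": {
        "graph_chars": ["H", "i"],
        "graph_times": [[0.0, 0.12], [0.12, 0.24]],
        "phon_chars": [],
        "phon_times": [],
    },
}
timings = grapheme_timings(example)
```

Each tuple in `timings` has the form `(character, start_seconds, end_seconds)`, which is convenient for driving captions or highlighting text during playback.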

Try it – Repeat the request above and decode audio_content locally:

curl --request POST "https://f.cluster.resemble.ai/synthesize" \
  -H "Authorization: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{
    "voice_uuid": "55592656",
    "data": "Hello from Resemble!",
    "sample_rate": 48000,
    "output_format": "wav"
  }' \
  | jq -r '.audio_content' | base64 --decode > output.wav
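Once the audio has been decoded, a quick sanity check is to read the WAV header and compare it with the sample_rate and duration fields of the response. The sketch below uses Python's standard `wave` module on bytes held in memory; `wav_info` is a hypothetical helper. It assumes the file is stored as integer PCM, since `wave` rejects other encodings such as MULAW, and whether it accepts the default PCM_32 output depends on how the service writes the header.

```python
import io
import wave
from typing import Any, Dict


def wav_info(audio_bytes: bytes) -> Dict[str, Any]:
    """Read the header fields of an integer-PCM WAV held in memory."""
    with wave.open(io.BytesIO(audio_bytes), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration": frames / rate,  # seconds
        }


def silent_wav(seconds: int, rate: int) -> bytes:
    """Build a mono 16-bit silent clip, used here only as test input."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * rate * seconds)
    return buffer.getvalue()
```

For the file produced by the command above, `wav_info(open("output.wav", "rb").read())` should report a sample rate of 48000 if the request was honored as written.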