Synchronous text-to-speech synthesis | Resemble

Generate speech synchronously from text or SSML. Returns complete audio as base64.

Authentication

Authorization: Bearer <token>

API token from https://app.resemble.ai/account/api
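As a minimal sketch, the token can be read from an environment variable instead of being hard-coded. The variable name RESEMBLE_API_TOKEN and the helper auth_headers are illustrative assumptions and are not part of the Resemble API.

```python
import os


def auth_headers(token: str) -> dict:
    """Build request headers carrying the API token as a Bearer credential."""
    if not token:
        raise ValueError("API token is empty")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


# RESEMBLE_API_TOKEN is a hypothetical variable name chosen for this example.
token = os.environ.get("RESEMBLE_API_TOKEN", "")
headers = auth_headers(token) if token else None
```

The resulting dictionary can be passed as the headers argument of an HTTP client call.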

Request

This endpoint expects a JSON object in the request body.

voice_uuid (string, required)

Voice UUID to use for synthesis

data (string, required)

Text or SSML to synthesize (max 3,000 characters)

project_uuid (string, optional)

UUID of the project in which to store the clip

title (string, optional)

Title for the generated clip

model (string, optional)

Model to use for synthesis. Pass chatterbox-turbo to use the Turbo model, which offers lower latency and supports paralinguistic tags. If not specified, the model defaults to Chatterbox or Chatterbox Multilingual, depending on the voice. Note: Chatterbox-Turbo is supported by all Rapid English voices and Pre Built Library voices.

precision (enum, optional, defaults to PCM_32)

Audio precision for WAV output

Allowed values:

output_format (enum, optional, defaults to wav)

Audio output format

Allowed values:

sample_rate (enum, optional)

Audio sample rate in Hz

use_hd (boolean, optional, defaults to false)

Enable HD synthesis at the cost of slightly higher latency
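The request fields above can be assembled with a small helper. This is a sketch under stated assumptions: build_payload and MAX_DATA_CHARS are hypothetical names, the check uses Python's len(), which may not count characters exactly as the service does, and the voice UUID is the placeholder from the request example, so it may not be a voice that supports chatterbox-turbo.

```python
MAX_DATA_CHARS = 3000  # limit stated for the `data` field


def build_payload(voice_uuid, data, **optional):
    """Return a request body, omitting optional fields whose value is None."""
    if len(data) > MAX_DATA_CHARS:
        raise ValueError(
            f"data is {len(data)} characters; the limit is {MAX_DATA_CHARS}"
        )
    payload = {"voice_uuid": voice_uuid, "data": data}
    # Only send optional fields that were actually set by the caller.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_payload(
    "55592656",  # placeholder voice UUID
    "Hello from Resemble!",
    model="chatterbox-turbo",
    output_format="wav",
    use_hd=False,
    title=None,  # None values are dropped from the payload
)
```

Leaving unset fields out of the body lets the documented defaults apply.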

Response

Successful synthesis

success (boolean or null)

audio_content (string or null, format: "byte")

Base64-encoded audio bytes

audio_timestamps (object or null)

duration (double or null)

Audio duration in seconds

synth_duration (double or null)

Raw synthesis time

output_format (string or null)

sample_rate (integer or null)

title (string or null)

issues (list of strings or null)
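Because audio_content is base64-encoded, it has to be decoded before it can be played or saved. The sketch below assumes that a falsy success value or a missing audio_content means the synthesis failed, which the reference does not state explicitly. decode_audio is a hypothetical helper, and the sample dictionary is a stand-in for a parsed response.

```python
import base64


def decode_audio(body: dict) -> bytes:
    """Return raw audio bytes from a parsed synthesis response."""
    # Assumption: no success flag or no audio means the request failed.
    if not body.get("success") or not body.get("audio_content"):
        raise RuntimeError(f"synthesis failed: {body.get('issues')}")
    return base64.b64decode(body["audio_content"])


# Stand-in response for illustration; a real one would come from response.json().
sample = {
    "success": True,
    "audio_content": base64.b64encode(b"RIFF").decode("ascii"),
    "output_format": "wav",
}
audio = decode_audio(sample)
```

The returned bytes can then be written to a file opened in binary mode, for example one named after the output_format field.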

import requests

# Synchronous synthesis endpoint
url = "https://f.cluster.resemble.ai/synthesize"

payload = {
    "voice_uuid": "55592656",
    "data": "Hello from Resemble!",
}
headers = {
    # Replace <token> with your API token
    "Authorization": "Bearer <token>",
    "Content-Type": "application/json",
}

response = requests.post(url, json=payload, headers=headers)

# The parsed body contains the base64-encoded audio in "audio_content"
print(response.json())

{
  "success": true,
  "audio_content": "string",
  "audio_timestamps": {
    "graph_chars": [
      "string"
    ],
    "graph_times": [
      [
        1.1
      ]
    ],
    "phon_chars": [
      "string"
    ],
    "phon_times": [
      [
        1.1
      ]
    ]
  },
  "duration": 1.1,
  "synth_duration": 1.1,
  "output_format": "wav",
  "sample_rate": 48000,
  "title": "string",
  "issues": [
    "string"
  ]
}
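The audio_timestamps object in the sample response holds character arrays next to time arrays. The sketch below assumes that graph_chars and graph_times are parallel lists, so that entry i of one describes entry i of the other; the reference does not confirm this, nor the meaning of the numbers inside each time entry. char_timings is a hypothetical helper and the sample values are invented for illustration.

```python
def char_timings(timestamps: dict) -> list:
    """Pair each entry of graph_chars with the matching entry of graph_times."""
    # Either field may be null or absent, so fall back to empty lists.
    chars = timestamps.get("graph_chars") or []
    times = timestamps.get("graph_times") or []
    return list(zip(chars, times))


# Invented values; real ones come from the "audio_timestamps" response field.
sample = {"graph_chars": ["H", "i"], "graph_times": [[0.0, 0.1], [0.1, 0.2]]}
for char, span in char_timings(sample):
    print(char, span)
```

The same pairing would apply to phon_chars and phon_times if they follow the same layout.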