Streaming text-to-speech synthesis (HTTP)
Stream audio as it's generated. Returns chunked WAV data for progressive playback.
Authentication
Authorization: Bearer
API token from https://app.resemble.ai/account/api
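As a minimal sketch of the Bearer scheme described above, the token goes into an Authorization header. The helper name auth_headers is hypothetical, and the JSON Content-Type is an assumption based on the endpoint expecting an object; neither comes from this page.

```python
def auth_headers(api_token: str) -> dict:
    """Build request headers carrying the API token as a Bearer credential.

    Assumption: the request body is JSON, so Content-Type is set accordingly.
    """
    token = api_token.strip()
    if not token:
        raise ValueError("API token must not be empty")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",  # assumed, not stated in the reference
    }
```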
Request
This endpoint expects an object.
voice_uuid
Voice UUID to use for synthesis
data
Text or SSML to synthesize (max 2000 characters)
project_uuid
Optional project UUID to store the clip
model
Model to use for synthesis. Pass chatterbox-turbo to use the Turbo model, which offers lower latency and supports paralinguistic tags. If omitted, the model defaults to Chatterbox or Chatterbox Multilingual, depending on the voice. Note: Chatterbox-Turbo is supported by all Rapid English voices and Pre Built Library voices.
precision
Audio precision
Allowed values:
sample_rate
Audio sample rate in Hz
use_hd
Enable HD synthesis with a small latency trade-off
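The request fields above could be assembled roughly as follows. This is a sketch under stated assumptions: the function name build_stream_body is hypothetical, voice_uuid and data are treated as required (the descriptions suggest this but do not state it), and precision is passed through unchecked because its allowed values are not listed here. Only the 2000-character limit on data is enforced, since that is the one constraint the reference spells out.

```python
from typing import Optional

MAX_DATA_CHARS = 2000  # limit stated for the `data` field


def build_stream_body(
    voice_uuid: str,
    data: str,
    project_uuid: Optional[str] = None,
    model: Optional[str] = None,
    precision: Optional[str] = None,
    sample_rate: Optional[int] = None,
    use_hd: Optional[bool] = None,
) -> dict:
    """Assemble the request object, omitting optional fields left unset."""
    if not voice_uuid:
        raise ValueError("voice_uuid is required")  # assumed to be required
    if not data:
        raise ValueError("data is required")  # assumed to be required
    if len(data) > MAX_DATA_CHARS:
        raise ValueError(f"data exceeds {MAX_DATA_CHARS} characters")

    body = {"voice_uuid": voice_uuid, "data": data}
    optional = {
        "project_uuid": project_uuid,
        "model": model,  # e.g. "chatterbox-turbo"
        "precision": precision,  # passed through; allowed values not listed here
        "sample_rate": sample_rate,
        "use_hd": use_hd,
    }
    # Send only the fields the caller actually set, so server defaults apply.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body
```

Leaving unset fields out of the object, rather than sending nulls, lets the documented default for model take effect.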
Response
Streaming audio response (chunked WAV)
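Because the audio arrives in chunks, a client can start writing or playing it before synthesis finishes. The sketch below uses only the Python standard library. The names iter_response_chunks and write_chunks are hypothetical, the endpoint URL is not given on this page and must be supplied by the caller as stream_url, and the JSON request encoding is an assumption.

```python
import json
import urllib.request
from typing import BinaryIO, Iterable, Iterator


def iter_response_chunks(
    stream_url: str, api_token: str, body: dict, chunk_size: int = 4096
) -> Iterator[bytes]:
    """POST the request object and yield audio bytes as they arrive.

    stream_url is caller-supplied; this reference does not state the endpoint URL.
    """
    request = urllib.request.Request(
        stream_url,
        data=json.dumps(body).encode("utf-8"),  # assumed JSON body
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:  # empty read means the stream has ended
                break
            yield chunk


def write_chunks(chunks: Iterable[bytes], sink: BinaryIO) -> int:
    """Write each chunk to sink as soon as it is received; return total bytes."""
    total = 0
    for chunk in chunks:
        sink.write(chunk)
        total += len(chunk)
    return total
```

To save a stream, open a file in binary mode and pass it as sink along with the generator from iter_response_chunks. A player could consume the same generator instead of a file.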
