WebSocket Streaming

Receiving Audio Data

  1. Connect to wss://websocket.cluster.resemble.ai/stream
  2. Send a synthesis request (JSON payload)
  3. Receive audio frames as they stream in (JSON or binary)
  4. Wait for the terminal audio_end message
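
The four steps above can be sketched as a single receive loop. This is a minimal sketch, not an official client: `stream_once` and the shape of `ws` are hypothetical (any object with an async `send` method that yields messages under `async for`), and it assumes the `audio_end` message arrives as JSON text even when audio frames are binary. Authentication is not described on this page, so it is left out.

```python
import base64
import json

STREAM_URL = "wss://websocket.cluster.resemble.ai/stream"  # step 1 endpoint


async def stream_once(ws, payload):
    """Run steps 2-4 on an already-open connection and return the audio bytes.

    `ws` is a hypothetical duck-typed connection: it needs an async
    `send(str)` method and must yield incoming messages under `async for`.
    """
    await ws.send(json.dumps(payload))            # step 2: send the request
    audio = bytearray()
    async for message in ws:                      # step 3: receive frames
        if isinstance(message, (bytes, bytearray)):
            audio.extend(message)                 # binary frame: raw audio bytes
            continue
        frame = json.loads(message)
        if frame.get("type") == "audio":          # JSON frame: base64 audio
            audio.extend(base64.b64decode(frame["audio_content"]))
        elif frame.get("type") == "audio_end":    # step 4: terminal message
            break
    return bytes(audio)
```

With the third-party `websockets` package, the connection returned by `websockets.connect(STREAM_URL)` has this shape, so it could be passed in as `ws` once any required authentication is in place.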

Request Payload

{
  "voice_uuid": "<voice_uuid>",
  "data": "<text or SSML>",
  "binary_response": false,
  "request_id": 0,
  "output_format": "wav",
  "sample_rate": 32000,
  "precision": "PCM_32",
  "no_audio_header": false
}

| Field | Required | Description |
| --- | --- | --- |
| voice_uuid | Yes | Voice used for synthesis. |
| data | Yes | Text or SSML. |
| binary_response | No | false for JSON frames (base64 audio); true for raw bytes. |
| output_format | No | wav (default) or mp3. |
| sample_rate | No | 8000, 16000, 22050, 32000, or 44100. |
| precision | No | PCM bit depth (PCM_32, PCM_24, PCM_16, MULAW). |
| no_audio_header | No | Skip the WAV header when streaming PCM. |
| request_id | No | Optional integer echoed in responses. |
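
One way to avoid sending a malformed request is to validate the fields client-side before serializing. The helper below is a hypothetical sketch (`build_request` is not part of any official SDK); the allowed values are copied from the table above, and optional fields are simply omitted when unset because the page only documents a default for output_format.

```python
import json

# Allowed values as listed in the field table.
ALLOWED_FORMATS = {"wav", "mp3"}
ALLOWED_SAMPLE_RATES = {8000, 16000, 22050, 32000, 44100}
ALLOWED_PRECISIONS = {"PCM_32", "PCM_24", "PCM_16", "MULAW"}


def build_request(voice_uuid, data, *, binary_response=False, request_id=None,
                  output_format="wav", sample_rate=None, precision=None,
                  no_audio_header=False):
    """Return the request payload as a dict, raising ValueError on bad input."""
    if not voice_uuid or not data:
        raise ValueError("voice_uuid and data are required")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    if sample_rate is not None and sample_rate not in ALLOWED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate!r}")
    if precision is not None and precision not in ALLOWED_PRECISIONS:
        raise ValueError(f"unsupported precision: {precision!r}")

    payload = {
        "voice_uuid": voice_uuid,
        "data": data,
        "binary_response": binary_response,
        "output_format": output_format,
    }
    # Include optional fields only when the caller set them.
    if request_id is not None:
        payload["request_id"] = int(request_id)
    if sample_rate is not None:
        payload["sample_rate"] = sample_rate
    if precision is not None:
        payload["precision"] = precision
    if no_audio_header:
        payload["no_audio_header"] = True
    return payload


# Serialize with json.dumps before sending over the socket.
message = json.dumps(build_request("<voice_uuid>", "Hello", sample_rate=32000))
```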

JSON Frames

{
  "type": "audio",
  "audio_content": "<base64>",
  "audio_timestamps": {
    "graph_chars": ["H", "e"],
    "graph_times": [[0.0374, 0.1247], [0.0873, 0.1746]],
    "phon_chars": ["h", "ˈe"],
    "phon_times": [[0.0374, 0.1247], [0.0873, 0.1746]]
  },
  "sample_rate": 32000,
  "request_id": 0
}
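
A JSON frame like the one above could be unpacked as follows. This is an illustrative sketch: `parse_audio_frame` is a hypothetical helper, and it assumes each entry in graph_times and phon_times is a pair of timing values aligned by index with the matching character list (the page shows the shape but does not define the units).

```python
import base64
import json


def parse_audio_frame(text):
    """Decode one JSON audio frame into (audio_bytes, graphemes, phonemes).

    graphemes and phonemes are lists of (char, (t0, t1)) tuples, pairing
    each character with its timing entry by position (assumed alignment).
    """
    frame = json.loads(text)
    if frame.get("type") != "audio":
        raise ValueError(f"not an audio frame: {frame.get('type')!r}")

    audio = base64.b64decode(frame["audio_content"])
    stamps = frame.get("audio_timestamps") or {}

    def pair(chars_key, times_key):
        chars = stamps.get(chars_key, [])
        times = stamps.get(times_key, [])
        return [(c, tuple(t)) for c, t in zip(chars, times)]

    graphemes = pair("graph_chars", "graph_times")
    phonemes = pair("phon_chars", "phon_times")
    return audio, graphemes, phonemes
```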

Binary Frames

When binary_response is true, frames contain contiguous audio bytes instead of JSON. By default the stream includes a WAV header; set no_audio_header to true if you want raw PCM chunks without it.
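
If you request headerless PCM, you have to supply the container yourself before most players will open the audio. The sketch below wraps collected chunks in a WAV file using Python's standard `wave` module. It rests on assumptions the page does not state: precision PCM_16 (2 bytes per sample), a single channel, and signed little-endian samples. `pcm_chunks_to_wav` is a hypothetical helper name.

```python
import io
import wave


def pcm_chunks_to_wav(chunks, sample_rate=32000, sample_width=2, channels=1):
    """Wrap raw PCM chunks in a WAV container and return the file bytes.

    Assumes 16-bit mono PCM by default; adjust sample_width and channels
    if your stream differs.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)   # bytes per sample
        wav.setframerate(sample_rate)    # must match the requested sample_rate
        for chunk in chunks:
            wav.writeframes(chunk)       # chunks are contiguous, so append in order
    return buffer.getvalue()
```

Pass the same sample_rate you sent in the request payload; a mismatch changes playback speed and pitch.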

Termination Message

{
  "type": "audio_end",
  "request_id": 0
}

Handle the terminal message to cleanly stop playback and reset application state.
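
Because request_id is echoed back, the terminal message can be matched to the request that started it. A minimal sketch of that bookkeeping, assuming the client assigns its own integer IDs (`RequestTracker` is a hypothetical helper, not part of the API):

```python
import json


class RequestTracker:
    """Tracks which request_ids are still streaming."""

    def __init__(self):
        self.in_flight = set()

    def started(self, request_id):
        """Record a request right after its payload is sent."""
        self.in_flight.add(request_id)

    def on_text_message(self, text):
        """Return the finished request_id for an audio_end message, else None."""
        message = json.loads(text)
        if message.get("type") != "audio_end":
            return None
        request_id = message.get("request_id")
        self.in_flight.discard(request_id)  # reset state for this request
        return request_id
```

When `on_text_message` returns an ID, the application can stop playback for that request and release any buffers tied to it.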