Synthesize Your First Clip

Learn how to synthesize audio clips using the Resemble API.

Prerequisites

A Resemble account with a confirmed email address
An API key (see Authentication)
A voice UUID (use a marketplace voice or create your own)

Quick Examples

HTTP (curl)

$ curl --request POST "https://f.cluster.resemble.ai/synthesize" \
>   -H "Authorization: YOUR_API_KEY" \
>   -H "Content-Type: application/json" \
>   --data '{
>     "voice_uuid": "YOUR_VOICE_UUID",
>     "data": "Hello from Resemble!",
>     "sample_rate": 22050,
>     "output_format": "wav"
>   }' \
>   | jq -r '.audio_content' | base64 --decode > output.wav

Node.js

$ npm install @resemble/node

import { Resemble } from "@resemble/node";

// Read the API key from the environment rather than hard-coding it.
Resemble.setApiKey(process.env.RESEMBLE_API_KEY);

// Top-level await requires an ES module (for example, a .mjs file).
const response = await Resemble.v2.clips.createSync("YOUR_PROJECT_UUID", {
  voice_uuid: "YOUR_VOICE_UUID",
  body: "Hello from Resemble!",
  title: "My First Clip"
});

// URL of the synthesized audio.
console.log(response.item.audio_src);

Python

$ pip install resemble

import os

from resemble import Resemble

# Read the API key from the environment rather than hard-coding it.
Resemble.api_key(os.environ.get("RESEMBLE_API_KEY"))

# Positional arguments: project UUID, voice UUID, text to synthesize.
response = Resemble.v2.clips.create_sync(
    "YOUR_PROJECT_UUID",
    "YOUR_VOICE_UUID",
    "Hello from Resemble!",
    title="My First Clip"
)

# URL of the synthesized audio.
print(response["item"]["audio_src"])
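
For comparison with the SDK call, the raw HTTP request from the curl example can also be sent with Python's standard library. The following is a minimal sketch, not an official client: the URL, headers, and payload fields are copied from the curl example, while the function names `build_synthesize_request` and `synthesize` and the 30-second timeout are illustrative assumptions.

```python
import json
import urllib.request

# Endpoint taken from the curl example.
SYNTHESIZE_URL = "https://f.cluster.resemble.ai/synthesize"


def build_synthesize_request(api_key, voice_uuid, text):
    """Build the same POST request as the curl example, without sending it.

    Hypothetical helper: the name and signature are not part of any SDK.
    """
    payload = {
        "voice_uuid": voice_uuid,
        "data": text,
        "sample_rate": 22050,
        "output_format": "wav",
    }
    return urllib.request.Request(
        SYNTHESIZE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(api_key, voice_uuid, text, timeout=30):
    """Send the request and return the parsed JSON response body.

    The timeout value is an assumption chosen for illustration.
    """
    request = build_synthesize_request(api_key, voice_uuid, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `synthesize(os.environ["RESEMBLE_API_KEY"], "YOUR_VOICE_UUID", "Hello from Resemble!")` would perform the network request. Keeping request construction in its own function makes it possible to inspect the request without contacting the server.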

Response

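The synthesis server (the /synthesize endpoint called in the curl example) returns a JSON body like the one below. The SDK examples above read a different field, item.audio_src, from their own response object.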

{
  "success": true,
  "audio_content": "<base64-encoded-audio>",
  "audio_timestamps": {
    "graph_chars": ["H", "e", "l", "l", "o", " ", "f", "r", "o", "m", " ", "R", "e", "s", "e", "m", "b", "l", "e", "!"],
    "graph_times": [[0.08, 0.12], [0.12, 0.18], [0.18, 0.24], [0.24, 0.30], [0.30, 0.36], [0.36, 0.40], [0.40, 0.46], [0.46, 0.52], [0.52, 0.58], [0.58, 0.64], [0.64, 0.68], [0.68, 0.74], [0.74, 0.80], [0.80, 0.86], [0.86, 0.92], [0.92, 0.98], [0.98, 1.04], [1.04, 1.10], [1.10, 1.16], [1.16, 1.68]],
    "phon_chars": [],
    "phon_times": []
  },
  "duration": 1.68,
  "issues": [],
  "output_format": "wav",
  "sample_rate": 22050,
  "seed": 3389672177,
  "synth_duration": 1.68,
  "title": null
}
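
In the sample response, `graph_chars` and `graph_times` have the same length, so each character appears to be paired with one `[start, end]` interval. The sketch below groups those per-character intervals into per-word timings. It assumes the values are seconds and that each pair is a start and end time, which the sample suggests (the last interval ends at the reported `duration`) but which is not stated here. `word_timings` is a hypothetical helper, not part of the API.

```python
def word_timings(graph_chars, graph_times):
    """Group per-character timings into (word, start, end) tuples.

    Assumes graph_times[i] is the [start, end] interval for graph_chars[i].
    Whitespace characters separate words and are not included in any word.
    """
    words = []
    current, start, end = "", None, None
    for char, (char_start, char_end) in zip(graph_chars, graph_times):
        if char.isspace():
            if current:
                words.append((current, start, end))
            current, start, end = "", None, None
            continue
        if not current:
            start = char_start  # first character of a new word
        current += char
        end = char_end  # extend the word to the end of this character
    if current:
        words.append((current, start, end))
    return words


# Values copied from the sample response.
timestamps = {
    "graph_chars": list("Hello from Resemble!"),
    "graph_times": [
        [0.08, 0.12], [0.12, 0.18], [0.18, 0.24], [0.24, 0.30], [0.30, 0.36],
        [0.36, 0.40], [0.40, 0.46], [0.46, 0.52], [0.52, 0.58], [0.58, 0.64],
        [0.64, 0.68], [0.68, 0.74], [0.74, 0.80], [0.80, 0.86], [0.86, 0.92],
        [0.92, 0.98], [0.98, 1.04], [1.04, 1.10], [1.10, 1.16], [1.16, 1.68],
    ],
}

for word, start, end in word_timings(
    timestamps["graph_chars"], timestamps["graph_times"]
):
    print(f"{word}: {start:.2f}s to {end:.2f}s")
```

Word-level timings like these could drive captions or text highlighting during playback.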

Decode audio_content from base64 to get the raw audio bytes.
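
A minimal Python sketch of that decoding step follows, mirroring what `base64 --decode` does in the curl example. `save_audio` is a hypothetical helper name. Treating a falsy `success` value as a failure and reporting `issues` in the error are assumptions based only on the field names in the sample response.

```python
import base64


def save_audio(response_body, path):
    """Decode audio_content from a parsed response and write it to path.

    Returns the number of bytes written. Raising on a falsy "success"
    field is an assumption, not documented behavior.
    """
    if not response_body.get("success"):
        raise ValueError(f"synthesis failed: {response_body.get('issues')}")
    audio_bytes = base64.b64decode(response_body["audio_content"])
    with open(path, "wb") as audio_file:
        audio_file.write(audio_bytes)
    return len(audio_bytes)
```

Given the parsed JSON body as a dictionary named `body`, `save_audio(body, "output.wav")` would write the same file as the curl pipeline.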

Next Steps

Learn about streaming synthesis for lower latency
Explore voice cloning to create custom voices
See SSML Reference for advanced speech control