Voice Design creates AI-generated voices from text descriptions—no audio recordings required. Describe the voice you want, choose from three candidates, and start generating speech instantly.
Voice Design uses AI to generate synthetic voices based on your text prompt. Instead of cloning an existing voice from audio samples, you describe characteristics like age, accent, tone, and style, and the system creates unique voices matching your description.
Perfect for:
Describe the voice you want. Be specific about:
Example prompts:
Need help writing effective prompts? See our Prompting Guide for tips and best practices.
Call the Generate Voice Candidates endpoint with your prompt:
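A minimal Python sketch of such a call, under stated assumptions: the URL, the `prompt` field name, and the Bearer-token header are hypothetical placeholders, not confirmed API details. The function only builds the request so that it can be inspected before anything is sent.

```python
import json
import urllib.request

# Hypothetical placeholder: substitute the real Generate Voice Candidates URL
# from the API reference.
GENERATE_CANDIDATES_URL = "https://api.example.com/voice-design"


def build_generate_request(prompt: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the voice description."""
    if not prompt.strip():
        raise ValueError("prompt must describe the voice you want")
    # The "prompt" key is an assumption; check the reference for the real name.
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        GENERATE_CANDIDATES_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_token}",  # auth scheme is assumed
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it once the placeholders are replaced with real values.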
You’ll receive three voice candidates that match your description:
Important: All three candidates share the same uuid (they’re from the same generation request). The voice_sample_index (0, 1, or 2) identifies which candidate is which.
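The relationship between `uuid` and `voice_sample_index` can be sketched in Python as follows. This assumes the response has already been parsed into a list of dictionaries carrying the `uuid`, `voice_sample_index`, and `audio_url` fields. The list shape and the sample values are illustrative, not taken from a real response.

```python
from typing import Any


def pick_candidate(candidates: list[dict[str, Any]], index: int) -> dict[str, Any]:
    """Return the candidate whose voice_sample_index equals `index`."""
    if index not in (0, 1, 2):
        raise ValueError("voice_sample_index must be 0, 1, or 2")
    # All candidates from one generation request should carry the same uuid.
    if len({c["uuid"] for c in candidates}) != 1:
        raise ValueError("expected candidates from a single generation request")
    for candidate in candidates:
        if candidate["voice_sample_index"] == index:
            return candidate
    raise LookupError(f"no candidate with voice_sample_index {index}")


# Made-up placeholder values, shaped after the fields named above.
candidates = [
    {
        "uuid": "GENERATION_UUID",
        "voice_sample_index": i,
        "audio_url": f"https://example.com/sample-{i}.wav",
    }
    for i in range(3)
]
chosen = pick_candidate(candidates, 2)
print(chosen["uuid"], chosen["voice_sample_index"])
```

Because the `uuid` is shared, the index is the only value that distinguishes one candidate from another.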
audio_url samples
voice_sample_index (0, 1, or 2)
uuid from the response
Convert your chosen candidate into a usable voice with Create Voice from Candidate:
Example (choosing candidate #1):
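As a hedged Python sketch, the two values that identify a candidate can be validated and bundled before the call. This reads "candidate #1" as `voice_sample_index` 1, which is an assumption. Whether the real endpoint expects these values in a JSON body or in the URL path is not confirmed here; the helper name and the placeholder uuid are hypothetical.

```python
def build_create_voice_payload(uuid: str, voice_sample_index: int) -> dict:
    """Bundle the identifiers of the chosen candidate (hypothetical payload shape)."""
    if not uuid:
        raise ValueError("uuid from the generation response is required")
    if voice_sample_index not in (0, 1, 2):
        raise ValueError("voice_sample_index must be 0, 1, or 2")
    return {"uuid": uuid, "voice_sample_index": voice_sample_index}


# "GENERATION_UUID" stands in for the uuid returned with the candidates.
payload = build_create_voice_payload("GENERATION_UUID", 1)
print(payload)
```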
Response:
The voice builds automatically in the background. You can use it immediately, even while it’s still training.
Use the voice_uuid to generate audio with the TTS endpoint:
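A minimal Python sketch of preparing such a request body follows. Only `voice_uuid` comes from the text above. The `text` key and the helper name are assumptions and may differ from the TTS endpoint's actual schema.

```python
import json


def build_tts_body(voice_uuid: str, text: str) -> bytes:
    """Serialize a TTS request body for the given voice (assumed schema)."""
    if not voice_uuid:
        raise ValueError("voice_uuid is required")
    if not text.strip():
        raise ValueError("text to synthesize must not be empty")
    # "text" is a hypothetical field name; check the TTS reference.
    payload = {"voice_uuid": voice_uuid, "text": text}
    return json.dumps(payload).encode("utf-8")


# "VOICE_UUID" stands in for the voice_uuid returned when the voice was created.
body = build_tts_body("VOICE_UUID", "Hello from my designed voice.")
```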
The voice is immediately usable in: