Speech-to-Speech
Convert a donor recording into a target voice while preserving delivery and timing. Speech-to-speech uses the same synthesis endpoint as synchronous TTS, but you pass SSML that references a source recording.
POST https://f.cluster.resemble.ai/synthesize
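As a rough illustration of the request described above, the following sketch builds (but does not send) a POST to the synthesis endpoint with SSML that points at a donor recording. Only the endpoint URL, the <speak> root, and the <resemble:convert> tag come from this page. The `src` attribute name, the `voice_uuid` and `data` body fields, and the Bearer-token header are assumptions made for the example; check them against the synchronous TTS reference before relying on them.

```python
import json
import urllib.request
from xml.sax.saxutils import quoteattr

# Endpoint taken from this page.
SYNTHESIZE_URL = "https://f.cluster.resemble.ai/synthesize"


def build_sts_request(api_token, voice_uuid, donor_url):
    """Build a speech-to-speech request object without sending it."""
    # ASSUMPTION: the donor recording is referenced through a `src` attribute.
    # quoteattr() quotes the value and escapes characters such as "&",
    # which are common in signed URLs.
    ssml = (
        f"<speak><resemble:convert src={quoteattr(donor_url)}>"
        "</resemble:convert></speak>"
    )
    # ASSUMPTION: the JSON body uses `voice_uuid` and `data`, as for sync TTS.
    body = json.dumps({"voice_uuid": voice_uuid, "data": ssml}).encode("utf-8")
    return urllib.request.Request(
        SYNTHESIZE_URL,
        data=body,
        method="POST",
        headers={
            # ASSUMPTION: Bearer-token authentication.
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and read the response body.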
The response is identical to a synchronous TTS response.
<resemble:convert> Attributes
Tip: Store donor files in cloud storage with signed URLs and revoke them after synthesis completes.
You can use the prompt attribute to guide how the donor audio is converted to the target voice. Unlike text-to-speech where the prompt is placed on the <speak> root element, for speech-to-speech conversion you must place the prompt attribute directly on the <resemble:convert> tag.
The prompt attribute allows you to adjust:
This provides fine-grained control over how the donor audio’s delivery is transformed while maintaining the original timing and prosody.
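The placement difference for the prompt attribute can be sketched with two small helpers: one puts the prompt on the <speak> root (text-to-speech), the other puts it on the <resemble:convert> tag (speech-to-speech). The helper names, the `src` attribute, and the example prompt wording are hypothetical and used only for illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def tts_ssml(text, prompt):
    """Text-to-speech: the prompt sits on the <speak> root element."""
    return f"<speak prompt={quoteattr(prompt)}>{escape(text)}</speak>"


def sts_ssml(donor_url, prompt):
    """Speech-to-speech: the prompt sits on the <resemble:convert> tag."""
    # ASSUMPTION: `src` is the attribute that references the donor recording.
    return (
        f"<speak><resemble:convert src={quoteattr(donor_url)} "
        f"prompt={quoteattr(prompt)}></resemble:convert></speak>"
    )


# Illustrative prompt text; the wording is not taken from the API reference.
print(tts_ssml("Hello there", "Speak calmly"))
print(sts_ssml("https://example.com/donor.wav", "Speak calmly"))
```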