SSML Reference
Resemble accepts Speech Synthesis Markup Language (SSML) so you can precisely control pronunciation, pacing, and style. Punctuation is handled automatically, but SSML tags unlock finer control such as emphasizing a word, spelling out acronyms, or inserting pauses.
Supported Elements
<speak> Root Element
<prosody>
Use to modulate pitch, speaking rate, or volume.
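As a minimal sketch, the following Python builds a `<prosody>` element inside a `<speak>` root. The attribute names `rate` and `pitch` and the values `slow` and `low` come from the W3C SSML specification; it is an assumption that Resemble accepts the same values. The helper name `prosody_ssml` is hypothetical.

```python
import xml.etree.ElementTree as ET

def prosody_ssml(text, **attrs):
    """Wrap text in <prosody> inside a <speak> root.

    attrs are passed through unchanged as prosody attributes,
    for example rate="slow". Values follow W3C SSML (assumed).
    """
    speak = ET.Element("speak")
    node = ET.SubElement(speak, "prosody", attrs)
    node.text = text  # ElementTree escapes &, <, > on serialization
    return ET.tostring(speak, encoding="unicode")

ssml = prosody_ssml("Please listen carefully.", rate="slow", pitch="low")
print(ssml)
```

Building the markup with an XML library rather than string concatenation keeps the output well-formed even when the text contains characters such as `&` or `<`.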
<emphasis>
<say-as>
Spell out characters or indicate alternate interpretation.
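The sketch below wraps an acronym in `<say-as>` so that it is read letter by letter. The `interpret-as="characters"` value is an assumption based on common SSML practice; this page does not list the values Resemble supports. The helper name `spell_out` is hypothetical.

```python
import xml.etree.ElementTree as ET

def spell_out(before, acronym, after=""):
    """Return SSML in which `acronym` is marked to be spelled out."""
    speak = ET.Element("speak")
    speak.text = before  # text preceding the <say-as> element
    node = ET.SubElement(speak, "say-as", {"interpret-as": "characters"})
    node.text = acronym
    node.tail = after    # text following the closing </say-as>
    return ET.tostring(speak, encoding="unicode")

ssml = spell_out("The file is hosted on ", "AWS", ".")
print(ssml)
```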
<sub>
<break>
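A pause can be inserted between two sentences as sketched below. The `time` attribute and the `500ms` value follow the W3C SSML specification; whether Resemble caps or restricts the duration is not stated here, so treat the value as illustrative. The helper name `with_pause` is hypothetical.

```python
import xml.etree.ElementTree as ET

def with_pause(before, after, time="500ms"):
    """Return SSML with a <break> of the given duration between two texts."""
    speak = ET.Element("speak")
    speak.text = before
    pause = ET.SubElement(speak, "break", {"time": time})
    pause.tail = after  # text that follows the self-closing <break>
    return ET.tostring(speak, encoding="unicode")

ssml = with_pause("First point.", " Second point.")
print(ssml)
```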
<lang>
Switch the language mid-stream if the voice supports it.
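The following sketch marks a single word as belonging to another language. The code `fr-FR` is only an illustration; confirm it against the supported `xml:lang` codes and the capabilities of your voice. The helper name `switch_language` is hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the attribute xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def switch_language(before, foreign, code, after=""):
    """Return SSML where `foreign` is spoken using language `code`."""
    speak = ET.Element("speak")
    speak.text = before
    node = ET.SubElement(speak, "lang", {XML_LANG: code})
    node.text = foreign
    node.tail = after
    return ET.tostring(speak, encoding="unicode")

ssml = switch_language("The French word for hello is ", "bonjour", "fr-FR", ".")
print(ssml)
```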
Supported xml:lang Codes
<resemble:convert>
Performs speech-to-speech using a donor recording.
The donor file may be at most 50 MB in size and 300 seconds in duration. Files exceeding either limit are trimmed.
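Because oversized files are trimmed rather than rejected, it can be worth checking a donor recording before submitting it. The sketch below is a client-side pre-check only and does not call any Resemble API. It assumes "50 MB" means 50,000,000 bytes, the more conservative reading; the function and constant names are hypothetical.

```python
MAX_BYTES = 50 * 1000 * 1000  # assumption: decimal megabytes (conservative)
MAX_SECONDS = 300

def check_donor(size_bytes, duration_seconds):
    """Return a list of limit violations; an empty list means within limits."""
    problems = []
    if size_bytes > MAX_BYTES:
        problems.append(f"size {size_bytes} bytes exceeds {MAX_BYTES}")
    if duration_seconds > MAX_SECONDS:
        problems.append(f"duration {duration_seconds}s exceeds {MAX_SECONDS}s")
    return problems

# The size could come from os.path.getsize(path); the duration must be
# read from the audio file with a tool of your choice.
print(check_donor(60_000_000, 301.0))
```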
Prompting with Speech-to-Speech
In text-to-speech, the prompt attribute goes on the <speak> root element. For speech-to-speech conversion, place it directly on the <resemble:convert> tag instead.
The prompt guides how the donor audio is converted to the target voice, letting you adjust accent, tone, or delivery style.
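The placement rule can be sketched as follows. Only the `prompt` attribute and its position on `<resemble:convert>` come from this page; the `src` attribute name, the example URL, and the prompt wording are assumptions made for illustration. The markup is built as a string because the `resemble:` prefix carries no namespace declaration here, which a strict XML parser would reject.

```python
from xml.sax.saxutils import quoteattr

def convert_ssml(src_url, prompt=None):
    """Build speech-to-speech SSML with the prompt on <resemble:convert>.

    `src` is an assumed attribute name for the donor recording URL.
    """
    attrs = f"src={quoteattr(src_url)}"  # quoteattr adds quotes and escapes
    if prompt is not None:
        # The prompt goes on the convert tag, not on <speak>.
        attrs += f" prompt={quoteattr(prompt)}"
    return f"<speak><resemble:convert {attrs} /></speak>"

ssml = convert_ssml(
    "https://example.com/donor.wav",
    prompt="Speak calmly with a soft tone",
)
print(ssml)
```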
