Voice Settings Presets

Voice Settings Presets let you save and reuse combinations of voice settings (pace, pitch, temperature, etc.) for text-to-speech synthesis. A preset keeps a voice style consistent across multiple synthesis requests without specifying each setting on every request.

Use Cases

  • Consistent Branding: Create presets for different content types (ads, narration, customer service)
  • Quick Experimentation: Save different voice styles and switch between them easily
  • Personal Library: Build your own collection of voice configurations for different projects
  • Efficient Workflow: Reuse proven voice configurations without remembering exact parameter values

Limitations

  • Maximum 5 custom presets per user
  • Preset names must be unique among your active presets
  • Default/public presets (created by Resemble) cannot be modified or deleted
  • Default presets do not count towards your 5 preset limit
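
The limits above can be checked on the client before attempting to create a preset. The following is a minimal sketch, not part of the API: the names PresetSummary, MAX_CUSTOM_PRESETS, and canCreatePreset are hypothetical, the list is assumed to contain only active presets, and name comparison is assumed to be exact (case-sensitive), which the limits above do not specify.

```typescript
// Hypothetical client-side check; only `name` and `is_public` come from the resource schema.
interface PresetSummary {
  name: string;
  is_public: boolean; // true for default presets created by Resemble
}

const MAX_CUSTOM_PRESETS = 5;

function canCreatePreset(
  existing: PresetSummary[],
  newName: string
): { ok: boolean; reason?: string } {
  // Default/public presets do not count towards the limit.
  const custom = existing.filter((preset) => !preset.is_public);

  if (custom.length >= MAX_CUSTOM_PRESETS) {
    return { ok: false, reason: `limit of ${MAX_CUSTOM_PRESETS} custom presets reached` };
  }
  // Assumption: uniqueness is an exact string match against your own presets.
  if (custom.some((preset) => preset.name === newName)) {
    return { ok: false, reason: `a preset named "${newName}" already exists` };
  }
  return { ok: true };
}
```

The server remains the authority on these rules; a check like this only avoids a round trip for requests that would be rejected anyway.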

Settings Parameters

Voice Settings Presets support the following parameters:

pace (0.2 - 2.0)

Controls speech speed. Default: 1.0

  • < 1.0: Slower speech
  • 1.0: Normal speed
  • > 1.0: Faster speech

temperature (0.1 - 5.0)

Controls voice variation and randomness. Default: 0.8

  • Lower values: More consistent, predictable voice
  • Higher values: More varied, expressive voice

pitch (-10 to 10)

Adjusts voice pitch (only applied for Voice Conversion/STS). Default: 0.0

  • Negative values: Lower pitch
  • Positive values: Higher pitch

useHd (boolean)

Enable high-definition audio quality. Default: false

  • true: HD quality (may cost more credits)
  • false: Standard quality

exaggeration (0.0 - 1.0)

Controls emotional expressiveness. Default: 0.5

  • 0.0: Minimal emotion
  • 1.0: Maximum emotional emphasis

description (string, max 1000 characters)

Text prompt describing desired voice style. Default: ""

  • Examples: “Speak in a calm and soothing tone”, “Sound excited and energetic”
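
To illustrate the ranges and defaults listed above, here is a small validation sketch. The ranges and default values are copied from this section; the names PresetSettings, validateSettings, and withDefaults are hypothetical helpers rather than part of any Resemble SDK, and treating every field as optional is an assumption.

```typescript
// Hypothetical helper types; field names match the parameters documented above.
interface PresetSettings {
  pace?: number;
  temperature?: number;
  pitch?: number;
  useHd?: boolean;
  exaggeration?: number;
  description?: string;
}

type NumericKey = "pace" | "temperature" | "pitch" | "exaggeration";

// Inclusive [min, max] ranges from the parameter list.
const NUMERIC_RANGES: Record<NumericKey, [number, number]> = {
  pace: [0.2, 2.0],
  temperature: [0.1, 5.0],
  pitch: [-10, 10],
  exaggeration: [0.0, 1.0],
};

// Documented defaults.
const DEFAULTS: Required<PresetSettings> = {
  pace: 1.0,
  temperature: 0.8,
  pitch: 0.0,
  useHd: false,
  exaggeration: 0.5,
  description: "",
};

// Returns a list of human-readable problems; an empty list means the settings look valid.
function validateSettings(settings: PresetSettings): string[] {
  const errors: string[] = [];
  for (const key of Object.keys(NUMERIC_RANGES) as NumericKey[]) {
    const [min, max] = NUMERIC_RANGES[key];
    const value = settings[key];
    if (value === undefined) continue;
    if (Number.isNaN(value) || value < min || value > max) {
      errors.push(`${key} must be between ${min} and ${max}`);
    }
  }
  if (settings.description !== undefined && settings.description.length > 1000) {
    errors.push("description must be at most 1000 characters");
  }
  return errors;
}

// Fills in the documented default for any setting that was left out.
function withDefaults(settings: PresetSettings): Required<PresetSettings> {
  return { ...DEFAULTS, ...settings };
}
```

For example, validateSettings({ pace: 3 }) reports one problem because 3 is above the 2.0 maximum for pace.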

Using Presets with Synthesis

Once created, include the voice_settings_preset_uuid parameter in your synthesis requests:

  {
    "voice_uuid": "your-voice-uuid",
    "data": "Hello from Resemble!",
    "sample_rate": 48000,
    "voice_settings_preset_uuid": "123e4567-e89b-12d3-a456-426614174000"
  }
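
A request body like the one above could be assembled as follows. This is a sketch under assumptions: the field names are taken from the example, but the SynthesisRequest type and the buildSynthesisRequest function are hypothetical, and the endpoint URL and authentication are left out because this section does not state them.

```typescript
// Field names mirror the JSON example; the type itself is a hypothetical convenience.
interface SynthesisRequest {
  voice_uuid: string;
  data: string;
  sample_rate?: number;
  voice_settings_preset_uuid?: string;
}

function buildSynthesisRequest(
  voiceUuid: string,
  text: string,
  presetUuid?: string,
  sampleRate: number = 48000
): SynthesisRequest {
  const body: SynthesisRequest = {
    voice_uuid: voiceUuid,
    data: text,
    sample_rate: sampleRate,
  };
  // Only include the preset key when a preset was actually chosen.
  if (presetUuid !== undefined) {
    body.voice_settings_preset_uuid = presetUuid;
  }
  return body;
}

// Serialize with JSON.stringify(...) before sending it as the request body.
const exampleBody = buildSynthesisRequest(
  "your-voice-uuid",
  "Hello from Resemble!",
  "123e4567-e89b-12d3-a456-426614174000"
);
```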

Important: Preset Priority Over SSML

When using voice_settings_preset_uuid, preset settings will OVERRIDE any equivalent SSML settings in your data field.

If you need fine-grained SSML control over pace, pitch, or other voice settings, do NOT use a preset UUID. Instead, pass your SSML in the data field without the voice_settings_preset_uuid parameter.
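
One way to guard against accidentally combining the two is to check the request before sending it. The sketch below is a heuristic, not the API's own logic: it assumes voice settings in SSML are expressed with the standard <prosody> element, and the names presetOverridesSsml and dropPresetForSsml are hypothetical.

```typescript
interface SynthesisRequest {
  voice_uuid: string;
  data: string;
  sample_rate?: number;
  voice_settings_preset_uuid?: string;
}

// Assumption: SSML pace/pitch control appears as a <prosody ...> tag in `data`.
const PROSODY_TAG = /<prosody\b/i;

// True when the request carries both a preset and SSML prosody markup,
// in which case the preset would override the SSML settings.
function presetOverridesSsml(request: SynthesisRequest): boolean {
  return (
    request.voice_settings_preset_uuid !== undefined &&
    PROSODY_TAG.test(request.data)
  );
}

// Returns a copy without the preset UUID when a conflict is detected,
// so the SSML in `data` keeps control. Otherwise returns the request unchanged.
function dropPresetForSsml(request: SynthesisRequest): SynthesisRequest {
  if (!presetOverridesSsml(request)) return request;
  const { voice_settings_preset_uuid: _dropped, ...withoutPreset } = request;
  return withoutPreset;
}
```

Whether to drop the preset silently or to surface a warning instead is a design choice; dropping it here follows the guidance above that SSML control requires omitting the preset UUID.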

Resource Schema

  interface VoiceSettingsPreset {
    uuid: string;
    name: string;
    settings: {
      pace: number;
      temperature: number;
      pitch: number;
      useHd: boolean;
      exaggeration: number;
      description: string;
    };
    is_public: boolean;
    created_at: string;
    updated_at: string;
  }
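
When a preset arrives as untyped JSON, a runtime check against this schema can catch malformed data early. The following type guard is a sketch: the interface is repeated from the schema above so the block is self-contained, while isVoiceSettingsPreset is a hypothetical helper. It checks field types only, not the value ranges.

```typescript
interface VoiceSettingsPreset {
  uuid: string;
  name: string;
  settings: {
    pace: number;
    temperature: number;
    pitch: number;
    useHd: boolean;
    exaggeration: number;
    description: string;
  };
  is_public: boolean;
  created_at: string;
  updated_at: string;
}

// Narrows an unknown value (e.g. the result of JSON.parse) to VoiceSettingsPreset.
function isVoiceSettingsPreset(value: unknown): value is VoiceSettingsPreset {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const s = v.settings as Record<string, unknown> | null | undefined;
  return (
    typeof v.uuid === "string" &&
    typeof v.name === "string" &&
    typeof v.is_public === "boolean" &&
    typeof v.created_at === "string" &&
    typeof v.updated_at === "string" &&
    typeof s === "object" &&
    s !== null &&
    typeof s.pace === "number" &&
    typeof s.temperature === "number" &&
    typeof s.pitch === "number" &&
    typeof s.useHd === "boolean" &&
    typeof s.exaggeration === "number" &&
    typeof s.description === "string"
  );
}
```

After the guard returns true, TypeScript treats the value as a VoiceSettingsPreset, so fields such as settings.pace can be read without further casts.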