
Minimax Voice Clone

Minimax Voice Clone creates a high-fidelity digital clone of a speaker’s voice from a short reference audio sample. It reproduces the speaker’s tone, emotion, accent, rhythm, and speaking style, then generates new speech from any text input.

MiniMax guide (verified)

MiniMax's audio line covers text-to-speech (Speech 2.6 HD / Turbo) and voice cloning. It is distinct from MiniMax's Hailuo video brand.

About this variant

Voice cloning from a ~30s reference recording.

Strengths
  • Speech 2.6 HD generates natural-sounding voices with emotion and intonation.
  • Voice clone learns from a short reference audio (~30s).
  • Turbo variant for fast iteration / drafts.
Weaknesses
  • English voices are stronger than non-English.
  • Voice clone may not perfectly match the source on extreme tones.
Best for
  • Voiceover for shorts
  • Custom voice characters for animation / games
  • Long-form narration
Parameters
  • noise_reduction
    boolean
    Enable noise reduction for the cloned voice
    default: false
  • model
    string
    TTS model to use for preview. Options: speech-02-hd, speech-02-turbo, speech-01-hd, speech-01-turbo
    default: speech-02-hd
  • audio_url
    string
    URL of the input audio file for voice cloning. Should be at least 10 seconds long. To retain the voice permanently, use it with a TTS (text-to-speech) endpoint at least once within 7 days. Otherwise, it will be automatically deleted.
  • need_volume_normalization
    boolean
    Enable volume normalization for the cloned voice
    default: false
  • text
    string
    Text to generate a TTS preview with the cloned voice (optional)
    default: Hello, this is a preview of your cloned voice! I hope you like it!
  • accuracy
    number
    Text validation accuracy threshold (0-1)
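The parameters above can be assembled into a request body before calling whatever endpoint hosts this model. The sketch below is illustrative only: the helper names `build_voice_clone_payload` and `retention_deadline` are hypothetical, no endpoint or client library is assumed, and the checks simply mirror the listed options and defaults. Measuring the 7-day retention window from the time of cloning is also an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional

# Values taken from the parameter list above.
ALLOWED_MODELS = ("speech-02-hd", "speech-02-turbo", "speech-01-hd", "speech-01-turbo")
DEFAULT_PREVIEW_TEXT = "Hello, this is a preview of your cloned voice! I hope you like it!"
RETENTION_DAYS = 7


def build_voice_clone_payload(
    audio_url: str,
    text: str = DEFAULT_PREVIEW_TEXT,
    model: str = "speech-02-hd",
    noise_reduction: bool = False,
    need_volume_normalization: bool = False,
    accuracy: Optional[float] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate inputs and return a request body as a dict."""
    if not audio_url:
        raise ValueError("audio_url is required (reference audio of at least 10 seconds)")
    if model not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {ALLOWED_MODELS}, got {model!r}")

    payload: Dict[str, Any] = {
        "audio_url": audio_url,
        "text": text,
        "model": model,
        "noise_reduction": noise_reduction,
        "need_volume_normalization": need_volume_normalization,
    }
    # No default is listed for accuracy, so it is only sent when provided.
    if accuracy is not None:
        if not 0.0 <= accuracy <= 1.0:
            raise ValueError("accuracy must be between 0 and 1")
        payload["accuracy"] = accuracy
    return payload


def retention_deadline(cloned_at: datetime) -> datetime:
    """Assumed rule: the voice must be used with a TTS endpoint within 7 days of cloning."""
    return cloned_at + timedelta(days=RETENTION_DAYS)


if __name__ == "__main__":
    body = build_voice_clone_payload(
        "https://example.com/reference.wav",  # placeholder URL
        noise_reduction=True,
        accuracy=0.8,
    )
    print(body)
    print("Use the voice before:", retention_deadline(datetime.now(timezone.utc)))
```

Validating locally catches an unsupported `model` value or an out-of-range `accuracy` before any audio is uploaded. The deadline helper serves as a reminder that an unused cloned voice is deleted automatically.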
You'll need
  • Reference audio