Zero-Shot Voice Cloning - a mrfakename Collection

Zero-Shot Voice Cloning

updated 23 days ago

TTS models that support zero-shot voice cloning

MegaTTS 3: Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis

Paper • 2502.18924 • Published Feb 26, 2025 • 13

Note https://github.com/bytedance/MegaTTS3

MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer

Paper • 2409.00750 • Published Sep 1, 2024 • 4

Note https://github.com/open-mmlab/Amphion

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching

Paper • 2410.06885 • Published Oct 9, 2024 • 47

Note https://github.com/SWivid/F5-TTS

StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion

Paper • 2409.10058 • Published Sep 16, 2024 • 2

Note https://github.com/yl4579/StyleTTS-ZS (official code not yet released; still under development)

E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS

Paper • 2406.18009 • Published Jun 26, 2024 • 23

Note Unofficial implementation: https://github.com/lucidrains/e2-tts-pytorch

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Paper • 2306.07691 • Published Jun 13, 2023 • 9

Note https://github.com/yl4579/StyleTTS2

XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model

Paper • 2406.04904 • Published Jun 7, 2024 • 9

Note https://github.com/coqui-ai/TTS

Better speech synthesis through scaling

Paper • 2305.07243 • Published May 12, 2023 • 5

Note https://github.com/neonbjb/tortoise-tts