HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis Paper • 2010.05646 • Published Oct 12, 2020 • 1
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis Paper • 2306.00814 • Published Jun 1, 2023 • 4
Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings Paper • 2110.03584 • Published Oct 7, 2021 • 1
OmniVoice 🌍 High-quality voice cloning TTS for 600+ languages • Running on Zero • Agents • Featured • 1.04k
worstchan/EAT-base_epoch30_finetune_AS2M Image Feature Extraction • 90.4M • Updated May 6, 2025 • 33.5k • 3
worstchan/EAT-large_epoch20_pretrain Image Feature Extraction • 0.3B • Updated May 6, 2025 • 1.05k • 1
Arabic TTS with FastPitch: Reproducible Baselines, Adversarial Training, and Oversmoothing Analysis Paper • 2512.00937 • Published Nov 30, 2025
laion/laions_got_talent_enhanced_flash_annotations_and_long_captions Updated Mar 19, 2025 • 2.72k • 18