LAION-Voice (LAION Voice)

posted an update 6 months ago

Excited to share that I've joined the Hugging Face Fellows program! 🤗

Looking forward to contributing to & working more closely with the open-source ecosystem - huge thanks to everyone who's supported me on this journey! 🚀

ChristophSchuhmann

published a dataset 6 months ago

LAION-Voice/cuda-docker

Updated Dec 3, 2025 • 4

ChristophSchuhmann

updated a dataset 6 months ago

LAION-Voice/laions-got-talent_wordlevel-annotation

Updated Nov 26, 2025 • 94

ChristophSchuhmann

published a dataset 7 months ago

LAION-Voice/laions-got-talent_wordlevel-annotation

Updated Nov 26, 2025 • 94

mrfakename

posted an update 7 months ago

Trained a model for emotion-controllable TTS based on MiMo audio on LAION's dataset.

It's still very early and the model does hallucinate, but results seem pretty good so far given how early it is in the training run.

Will probably kick off a new run later with some settings tweaked.

Put up a demo here: https://huggingface.co/spaces/mrfakename/EmoAct-MiMo

(Turn 🔊 on to hear audio samples)

gijs

authored a paper 8 months ago

AudioToolAgent: An Agentic Framework for Audio-Language Models

Paper • 2510.02995 • Published Oct 3, 2025

laion-voice-bot

updated a dataset 9 months ago

LAION-Voice/LAION-Voice-WIP

Viewer • Updated Sep 13, 2025 • 1.62M • 3 • 1

mrfakename

published a dataset 9 months ago

LAION-Voice/LAION-Voice-WIP

Viewer • Updated Sep 13, 2025 • 1.62M • 3 • 1

kadirnar

posted an update 9 months ago

What can you do with the VyvoTTS library?

- Using the PT model, you can train a model in a language it has never been trained on. There’s no need for large datasets.
- With the PT model, you can easily replicate the voice of any character you want. Just 1k samples are enough.
- You can add emotion support with a small dataset.

GitHub: https://github.com/Vyvo-Labs/VyvoTTS
HuggingFace: Vyvo

HarryHe

authored 2 papers 10 months ago

Overview of the Amphion Toolkit (v0.2)

Paper • 2501.15442 • Published Jan 26, 2025 • 4

Fact2Fiction: Targeted Poisoning Attack to Agentic Fact-checking System

Paper • 2508.06059 • Published Aug 8, 2025 • 4

gijs

authored 4 papers 10 months ago

Audio-Language Datasets of Scenes and Events: A Survey

Paper • 2407.06947 • Published Jul 9, 2024

AudSemThinker: Enhancing Audio-Language Models through Reasoning over Semantics of Sound

Paper • 2505.14142 • Published May 20, 2025

Data-Balanced Curriculum Learning for Audio Question Answering

Paper • 2507.06815 • Published Jul 9, 2025

ACES: Evaluating Automated Audio Captioning Models on the Semantics of Sounds

Paper • 2403.18572 • Published Mar 27, 2024

mrfakename

posted an update about 1 year ago

Papla P1 from Papla Media is now available on the TTS Arena!

Try out Papla's new ultra-realistic TTS model + compare it with other leading models on the TTS Arena: TTS-AGI/TTS-Arena