Aurora-1.6B: Multilingual Emotion and Singing TTS Model
A fine-tuned version of Dia-1.6B trained on multilingual and singing datasets, with full emotion control and zero-shot voice cloning.
Features
- Multilingual Support: Natural speech in Italian, English, Polish, German, French, and more.
- Emotion Control: Use speaker tags or emotion tokens (e.g. [S1], [happy], [sad]) to modulate expressiveness.
- Singing Capabilities: Generate melodic vocals by providing singing prompts or style references.
- Zero-Shot Voice Cloning: Clone any speaker's voice from a short audio sample.
- Nonverbal Vocalizations: Embed realistic effects like (laughs), (coughs), or (sighs) inline.
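The tag conventions listed above can be composed programmatically. The following is a minimal sketch, not part of the Aurora or Dia API: `build_prompt` is a hypothetical helper, and it assumes that a prompt starts with a speaker tag such as [S1], optionally followed by an emotion token such as [happy], as in the examples on this card. Which emotion tokens the model actually recognizes is not specified here, so the helper does no validation of them.

```python
from typing import Optional


def build_prompt(text: str, speaker: int = 1, emotion: Optional[str] = None) -> str:
    """Prefix text with a speaker tag and an optional emotion token.

    Hypothetical helper: the tag layout ([S1][happy] ...) is assumed from
    the examples on this card, not taken from a documented API.
    """
    if speaker < 1:
        raise ValueError("speaker index must be >= 1")
    tags = f"[S{speaker}]"
    if emotion:
        tags += f"[{emotion}]"
    # Nonverbal cues such as (laughs) are left inline in the text as-is.
    return f"{tags} {text.strip()}"


if __name__ == "__main__":
    print(build_prompt("Hello world! (laughs) Nice to meet you.", emotion="happy"))
```

The resulting string can then be passed to `model.generate` in the same way as the hand-written prompt in the Usage section.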
Usage
from dia.model import Dia
import soundfile as sf
# Load the Aurora-1.6B model
model = Dia.from_pretrained("Lorenzob/aurora-1.6b")
# Generate a happy spoken line followed by singing
text = "[S1][happy] Hello world! Now sing 'Happy Birthday to You'"
audio = model.generate(text)
# Save output at 44.1 kHz
sf.write("output.wav", audio, 44100)
Base model: nari-labs/Dia-1.6B