Aurora-1.6B: Multilingual Emotion and Singing TTS Model

A fine-tuned version of Dia-1.6B trained on multilingual and singing datasets, with full emotion control and zero-shot voice cloning.

Features

  • Multilingual Support
    Natural speech in Italian, English, Polish, German, French, and more.
  • Emotion Control
    Combine speaker tags (e.g. [S1]) with emotion tokens (e.g. [happy], [sad]) to modulate expressiveness.
  • Singing Capabilities
    Generate melodic vocals by providing singing prompts or style references.
  • Zero-Shot Voice Cloning
    Clone a speaker’s voice from a short reference audio sample, with no additional fine-tuning.
  • Nonverbal Vocalizations
    Embed realistic effects like (laughs), (coughs), or (sighs) inline.
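
The tags described above can be combined into one prompt string. The sketch below is only an illustration: build_prompt is a hypothetical helper, not part of the dia package, and it assumes the speaker tag comes first, followed by an optional emotion token, as in the Usage example.

```python
def build_prompt(text, speaker="S1", emotion=None):
    """Compose a prompt such as '[S1][happy] Hello'.

    Hypothetical helper: the tag order (speaker, then emotion) is an
    assumption based on the usage example in this model card.
    """
    tags = f"[{speaker}]"
    if emotion:
        tags += f"[{emotion}]"
    # Nonverbal markers such as (laughs) are left inline in the text.
    return f"{tags} {text.strip()}"


prompt = build_prompt("Hello world! (laughs) Nice to meet you.", emotion="happy")
print(prompt)
```

The resulting string could then be passed to model.generate in place of a hand-written prompt.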

Usage

from dia.model import Dia
import soundfile as sf

# Load the Aurora-1.6B model
model = Dia.from_pretrained("Lorenzob/aurora-1.6b")

# Generate a happy spoken line followed by singing
text = "[S1][happy] Hello world! Now sing 'Happy Birthday to You'"
audio = model.generate(text)

# Save output at 44.1 kHz
sf.write("output.wav", audio, 44100)