Work In Progress
- This is a checkpoint of HKUSTAudio/Llasa-1B-Multilingual, fine-tuned on Cantonese audio data.
- Two additional tokens are added: <|YUE_START|> and <|YUE_END|>. The chat template is:
formatted_text = f"<|TEXT_UNDERSTANDING_START|><|YUE_START|>{input_text}<|YUE_END|><|TEXT_UNDERSTANDING_END|>"
chat = [
    {"role": "user", "content": "Convert the text to speech:" + formatted_text},
    {"role": "assistant", "content": "<|SPEECH_GENERATION_START|>" + "".join(speech_ids_prefix)},
]
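The template above can be wrapped in a small helper so that the Cantonese markers are always applied consistently. This is a minimal sketch, not the author's code: the function name `build_chat` is hypothetical, and it assumes `speech_ids_prefix` is an iterable of speech-token strings that may be empty when no reference audio is used.

```python
def build_chat(input_text, speech_ids_prefix=()):
    """Build the two-message chat for Cantonese TTS (hypothetical helper)."""
    # Wrap the input text in the text-understanding and Cantonese markers.
    formatted_text = (
        "<|TEXT_UNDERSTANDING_START|><|YUE_START|>"
        f"{input_text}"
        "<|YUE_END|><|TEXT_UNDERSTANDING_END|>"
    )
    return [
        {"role": "user", "content": "Convert the text to speech:" + formatted_text},
        # The assistant turn is left open so the model continues with speech tokens.
        # Assumption: speech_ids_prefix holds speech-token strings, possibly none.
        {
            "role": "assistant",
            "content": "<|SPEECH_GENERATION_START|>" + "".join(speech_ids_prefix),
        },
    ]


chat = build_chat("你好")
print(chat[0]["content"])
```

The resulting `chat` list would then presumably be passed to the tokenizer's chat template in the same way as for the base Llasa model; that step is not shown here because it requires downloading the checkpoint.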
Roadmap
- Train on more data
- Train with emotions, speaker characteristics (gender, age)
- Benchmark with character error rate (CER)
- Gradio space
- Train with LayerSkip
- Train on better filtered data
- Release training code