---
language:
- en
base_model:
- sesame/csm-1b
- senstella/csm-expressiva-1b
- meta-llama/Llama-3.2-1B
- Vikhrmodels/Vikhr-Llama-3.2-1B-Instruct
- fixie-ai/ultravox-v0_5-llama-3_2-1b
pipeline_tag: text-to-speech
---
**The model supports multilingual transcription, but voice output is only in English or English-like languages.**

Models:

- CSM: [sesame/csm-1b](https://huggingface.co/sesame/csm-1b)
- CSM-EXPRESSIVA(WHISPERING & NO VC): [senstella/csm-expressiva-1b](https://huggingface.co/senstella/csm-expressiva-1b)
- LLAMA: [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B)
- LLAMA-VIKHR: [Vikhrmodels/Vikhr-Llama-3.2-1B-Instruct](https://huggingface.co/Vikhrmodels/Vikhr-Llama-3.2-1B-Instruct)
- LLAMA-ULTRAVOX: [fixie-ai/ultravox-v0_5-llama-3_2-1b](https://huggingface.co/fixie-ai/ultravox-v0_5-llama-3_2-1b)

### CSM:
### CSM-EXPRESSIVA(WHISPERING & NO VC):
### LLAMA:
### LLAMA-VIKHR:
### LLAMA-ULTRAVOX: