---
language:
- en
base_model:
- sesame/csm-1b
- senstella/csm-expressiva-1b
- meta-llama/Llama-3.2-1B
- Vikhrmodels/Vikhr-Llama-3.2-1B-Instruct
- fixie-ai/ultravox-v0_5-llama-3_2-1b
pipeline_tag: text-to-speech
---

**The model supports multilingual transcription, but voice output is only in English or English-like languages.**

Models:

- CSM: [sesame/csm-1b](https://huggingface.co/sesame/csm-1b)
- CSM-EXPRESSIVA(WHISPERING & NO VC): [senstella/csm-expressiva-1b](https://huggingface.co/senstella/csm-expressiva-1b)
- LLAMA: [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B)
- LLAMA-VIKHR: [Vikhrmodels/Vikhr-Llama-3.2-1B-Instruct](https://huggingface.co/Vikhrmodels/Vikhr-Llama-3.2-1B-Instruct)
- LLAMA-ULTRAVOX: [fixie-ai/ultravox-v0_5-llama-3_2-1b](https://huggingface.co/fixie-ai/ultravox-v0_5-llama-3_2-1b)

### CSM:

### CSM-EXPRESSIVA(WHISPERING & NO VC):

### LLAMA:

### LLAMA-VIKHR:

### LLAMA-ULTRAVOX: