Spark-TTS fine-tuned on Genshin character voices.
- GitHub code: https://github.com/nonwesjoe/genshin-sparktts
- Kaggle notebook: https://www.kaggle.com/code/suziwsz/genshin-sparktts/
Available characters
- paimon, hutao, furina, kazuha, xiao, mona, ganyu, xiangling, shotgun, citlali, barbara, zhongli, venti, nahida, kaeya, yaoyao, yoimiya, nilou (each character is a separate, fully fine-tuned model)
Usage
- Python 3.12 is recommended
- git clone https://github.com/nonwesjoe/genshin-sparktts.git && cd genshin-sparktts
- when CUDA is available, install torch 2.7.1 with CUDA support
pip install torch torchaudio torchvision -i https://download.pytorch.org/whl/cu118/
otherwise, install the CPU build of torch 2.7.1
pip install torch torchaudio torchvision -i https://download.pytorch.org/whl/cpu
- install other requirements
pip install -r requirements.txt
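The CUDA-or-CPU choice in the installation steps can be sketched in Python as follows. This is only an illustration, not part of the repository: the helper names `choose_index_url` and `build_pip_command` are hypothetical, and treating an `nvidia-smi` binary on the PATH as a sign of a usable NVIDIA driver is a heuristic assumption. The two index URLs are copied from the commands above.

```python
import shutil

# Index URLs copied from the install commands in this README.
CUDA_INDEX = "https://download.pytorch.org/whl/cu118/"
CPU_INDEX = "https://download.pytorch.org/whl/cpu"


def choose_index_url(has_nvidia_driver: bool) -> str:
    """Return the PyTorch wheel index that matches the machine."""
    return CUDA_INDEX if has_nvidia_driver else CPU_INDEX


def build_pip_command(has_nvidia_driver: bool) -> list[str]:
    """Build the pip command as an argument list (hypothetical helper)."""
    return [
        "pip", "install", "torch", "torchaudio", "torchvision",
        "-i", choose_index_url(has_nvidia_driver),
    ]


# Assumption: finding `nvidia-smi` on PATH suggests an NVIDIA driver is installed.
has_driver = shutil.which("nvidia-smi") is not None
print(" ".join(build_pip_command(has_driver)))
```

The script only prints the command it would pick, so it is safe to run before deciding which build to install.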
- in a terminal, set the following environment variables
export CHARACTOR=nahida # or any other available character
export MODEL_PATH=/kaggle/working/genshin/ # path to your model directory
export INPUT_TEXT="楼下发荔枝了吗?那我们快去领取!" # text to synthesize ("Are they handing out lychees downstairs? Then let's go get some!")
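For readers who want to see how a script could consume these variables, here is a minimal sketch. It is not the repository's actual `run.py`; the `TTSConfig` class and `load_config` function are hypothetical names, and the `./genshin` fallback for `MODEL_PATH` is an assumption based on the default download location mentioned in this README.

```python
import os
from dataclasses import dataclass


@dataclass
class TTSConfig:
    character: str
    model_path: str
    input_text: str


def load_config(env=os.environ) -> TTSConfig:
    """Read the three variables, failing early if a required one is missing."""
    missing = [key for key in ("CHARACTOR", "INPUT_TEXT") if not env.get(key)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return TTSConfig(
        character=env["CHARACTOR"],
        # Assumed fallback: the default download directory named in this README.
        model_path=env.get("MODEL_PATH", "./genshin"),
        input_text=env["INPUT_TEXT"],
    )


# Demo with an explicit mapping instead of the real environment.
print(load_config({"CHARACTOR": "nahida", "INPUT_TEXT": "hello"}))
```

Passing a plain dictionary, as in the demo line, makes the function easy to try without exporting anything in the shell.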
- download the model files: by default, this downloads only the character model named in the CHARACTOR environment variable. The model is downloaded to ./genshin
python3 download.py
- run the script to convert the text to audio. The audio is written to sparktts.wav.
python3 run.py
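After `run.py` finishes, a quick sanity check of the output file can be useful. The sketch below uses only the Python standard library and assumes that `sparktts.wav` is a PCM WAV file, which this README does not state. The helpers `describe_wav` and `write_test_tone` are hypothetical, and the generated tone only stands in for real TTS output so that the example runs on its own.

```python
import math
import struct
import tempfile
import wave
from pathlib import Path


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        return rate, wav.getnchannels(), wav.getnframes() / rate


def write_test_tone(path, rate=16000, seconds=0.5, freq=440.0):
    """Write a short mono 16-bit sine tone, standing in for real TTS output."""
    n_frames = int(rate * seconds)
    frames = b"".join(
        struct.pack("<h", int(12000 * math.sin(2 * math.pi * freq * i / rate)))
        for i in range(n_frames)
    )
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(frames)


# Self-contained demo on a generated file in the system temp directory.
demo_path = Path(tempfile.gettempdir()) / "sparktts_demo.wav"
write_test_tone(demo_path)
print(describe_wav(demo_path))
```

To inspect the real output, call `describe_wav("sparktts.wav")` from the directory where `run.py` was executed. A duration of zero or an error from `wave.open` would suggest the synthesis step did not complete, or that the file is not PCM.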
Details
- the model was trained in float32 but is saved in float16 to reduce VRAM and storage usage.
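The storage effect of saving in float16 can be estimated with simple arithmetic. The sketch below assumes roughly 500 million parameters, a figure inferred only from the "0.5B" in the base model name SparkAudio/Spark-TTS-0.5B; the actual checkpoint size may differ. It also shows the small rounding error introduced when a value is stored in half precision. The function names are hypothetical.

```python
import struct

BYTES_FP32 = struct.calcsize("<f")  # 4 bytes per float32 value
BYTES_FP16 = struct.calcsize("<e")  # 2 bytes per float16 value


def checkpoint_gib(num_params, bytes_per_param):
    """Approximate weight storage in GiB, ignoring file-format overhead."""
    return num_params * bytes_per_param / 2**30


def roundtrip_fp16(x):
    """Store a Python float as float16 and read it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Assumption: about 0.5 billion parameters, taken from the base model's name.
ASSUMED_PARAMS = 500_000_000

print(f"float32: {checkpoint_gib(ASSUMED_PARAMS, BYTES_FP32):.2f} GiB")
print(f"float16: {checkpoint_gib(ASSUMED_PARAMS, BYTES_FP16):.2f} GiB")
print(f"0.1 stored as float16 reads back as {roundtrip_fp16(0.1)!r}")
```

Halving the bytes per parameter halves the weight storage, at the cost of a small rounding error in each stored value.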
Examples
- Furina (芙宁娜)
- Kazuha (万叶)
- Paimon (派蒙)
- Hutao (胡桃)
- Xiao (魈)
- Citlali (茜特菈莉)
Acknowledgement
Model tree for wesjos/spark-tts-genshin-charactors
Base model
SparkAudio/Spark-TTS-0.5B