---
license: mit
language:
- en
base_model:
- coqui/XTTS-v1
---

# Fine-Tuned XTTS Model

This project fine-tunes an XTTS text-to-speech (TTS) model on an mp3 file extracted from a YouTube video. Training was run in a Hugging Face Space executed locally via Docker. A GPU is recommended for faster training.

### Training Data

- **Source Video**: [YouTube Video](https://www.youtube.com/watch?v=u6J20_Aem3Y)
- **Training Audio**: The mp3 file used for training is included in the `files` directory.

### Hugging Face Space

The fine-tuning process is based on the Hugging Face Space found here:

[FineTune Xtts Space](https://huggingface.co/spaces/drewThomasson/FineTune_Xtts)

### Docker Setup

#### With GPU

To run the training with GPU support:

```bash
docker run -it -p 7860:7860 --gpus all --pull always --platform=linux/amd64 registry.hf.space/drewthomasson-finetune-xtts:latest python app.py
```

#### Without GPU

To run without GPU support:

```bash
docker run -it -p 7860:7860 --pull always --platform=linux/amd64 registry.hf.space/drewthomasson-finetune-xtts:latest python app.py
```

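The two commands differ only in the `--gpus all` flag, so the choice between them can be scripted. The following is a minimal sketch, not part of the original Space: `has_gpu` and `build_cmd` are hypothetical helper names, and it assumes that a working `nvidia-smi` on the host is a reasonable proxy for GPU availability. Docker also needs the NVIDIA Container Toolkit for `--gpus all` to work, which this check does not verify. The script only prints the command; it does not start the container.

```bash
#!/usr/bin/env bash
# Sketch: print the docker command from this README, adding --gpus all
# only when an NVIDIA GPU appears to be usable on the host.

IMAGE="registry.hf.space/drewthomasson-finetune-xtts:latest"

# has_gpu (hypothetical helper): treats a successful `nvidia-smi` call
# as evidence that a GPU is available. This is an assumption, not a guarantee.
has_gpu() {
  command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1
}

# build_cmd (hypothetical helper): prints the command for "gpu" or "cpu" mode.
build_cmd() {
  gpu_flag=""
  if [ "$1" = "gpu" ]; then
    gpu_flag="--gpus all "
  fi
  echo "docker run -it -p 7860:7860 ${gpu_flag}--pull always --platform=linux/amd64 ${IMAGE} python app.py"
}

# Print the command that matches the detected hardware.
if has_gpu; then
  build_cmd gpu
else
  build_cmd cpu
fi
```

Review the printed command before running it, since GPU detection on the host may not reflect what Docker can actually access.
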
### Notes

- Use a GPU if one is available; training also runs without one, but more slowly.
- Because both commands pass `--pull always`, Docker pulls the latest version of the image each time it is run.
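
The `--pull always` flag in the commands above is what causes the image to be pulled on every run. As a hedged sketch for reusing a locally cached image instead, the hypothetical helper `build_cmd_with_pull` below prints the CPU command with a different pull policy. `always`, `missing`, and `never` are the values that `docker run --pull` accepts; note that with `missing`, the `:latest` tag may resolve to an older cached image.

```bash
# build_cmd_with_pull (hypothetical helper): prints the CPU command from
# this README with the requested pull policy. Defaults to "always".
build_cmd_with_pull() {
  policy="${1:-always}"
  case "$policy" in
    always|missing|never) ;;
    *)
      echo "unknown pull policy: $policy" >&2
      return 1
      ;;
  esac
  echo "docker run -it -p 7860:7860 --pull ${policy} --platform=linux/amd64 registry.hf.space/drewthomasson-finetune-xtts:latest python app.py"
}
```

For example, `build_cmd_with_pull missing` prints a command that pulls the image only when no local copy exists.
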