metadata

license: mit
language:
  - en
base_model:
  - coqui/XTTS-v1

Fine-Tuned XTTS Model

This project fine-tunes an XTTS text-to-speech (TTS) model on an MP3 file extracted from a YouTube video. Training was run in a Hugging Face Space executed locally via Docker. A GPU is recommended for faster training.

Training Data

Source Video: YouTube Video
Training Audio: The MP3 file used for training is included in the files directory.

Hugging Face Space

The fine-tuning process is based on the Hugging Face Space found here:
FineTune Xtts Space

Docker Setup

With GPU

To run the training with GPU support:

docker run -it -p 7860:7860 --gpus all --pull always --platform=linux/amd64 registry.hf.space/drewthomasson-finetune-xtts:latest python app.py

Without GPU

To run without GPU support:

docker run -it -p 7860:7860 --pull always --platform=linux/amd64 registry.hf.space/drewthomasson-finetune-xtts:latest python app.py
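The two commands above differ only in the `--gpus all` flag. As a minimal sketch of that relationship, the Python snippet below assembles the same argument list and prints it without starting anything. `build_docker_command` is a hypothetical helper name, not part of the Space or of Docker, and the `pull` parameter is added here purely for illustration.

```python
import shlex

# Image reference copied from the commands in this README.
IMAGE = "registry.hf.space/drewthomasson-finetune-xtts:latest"


def build_docker_command(gpu=True, pull="always"):
    """Return the `docker run` argument list used in this README.

    Hypothetical helper for illustration only. `pull` is passed to
    Docker's --pull flag, which accepts "always", "missing", or "never".
    """
    cmd = ["docker", "run", "-it", "-p", "7860:7860"]
    if gpu:
        # The only difference between the GPU and CPU variants.
        cmd += ["--gpus", "all"]
    cmd += ["--pull", pull, "--platform=linux/amd64", IMAGE, "python", "app.py"]
    return cmd


if __name__ == "__main__":
    # Print both variants as copy-pasteable shell commands.
    print(shlex.join(build_docker_command(gpu=True)))
    print(shlex.join(build_docker_command(gpu=False)))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would start the container; that call is left out so the sketch has no side effects.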

Notes

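Use a GPU for training where possible; CPU-only runs will be slower. The --gpus all flag requires an NVIDIA GPU and the NVIDIA Container Toolkit on the host.
Because both commands pass --pull always, Docker pulls the latest image every time the container is started. Omit that flag to reuse a locally cached image.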
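Both Docker commands publish container port 7860 on the host. Assuming the app serves a web UI on that port and answers a GET on `/` with HTTP 200 (common for Hugging Face Spaces, but not stated in this README), a sketch like the following could wait until the interface is reachable after the container starts. `wait_for_app` is a hypothetical helper name.

```python
import http.client
import time


def wait_for_app(host="localhost", port=7860, timeout=120.0, interval=2.0):
    """Poll http://host:port/ until it returns HTTP 200 or `timeout` expires.

    Assumption: the app answers GET / with status 200 once it is ready.
    Returns True on success, False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        conn = http.client.HTTPConnection(host, port, timeout=5.0)
        try:
            conn.request("GET", "/")
            if conn.getresponse().status == 200:
                return True
        except (OSError, http.client.HTTPException):
            pass  # nothing listening yet, or the connection was dropped
        finally:
            conn.close()
        time.sleep(interval)
    return False
```

Calling `wait_for_app()` with its defaults polls `localhost:7860` for up to two minutes. An HTTP request is used rather than a bare TCP connect because Docker's port forwarding can accept connections before the app inside the container is ready.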