MeloTTS Model Checkpoint
This repository contains trained model checkpoints for MeloTTS, a high-quality multilingual text-to-speech system. The checkpoints can be loaded with MeloTTS for English text-to-speech synthesis.
Model Details
- Model Type: MeloTTS
- Language Support: English (Default)
- Sampling Rate: 44.1 kHz
- Mel Channels: 128
- Hidden Channels: 192
- Filter Channels: 768
Architecture Details
- Inter channels: 192
- Number of heads: 2
- Number of layers: 6
- Flow layers: 3
- Kernel size: 3
- Dropout rate: 0.1
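As a quick sanity check on how these numbers relate, the sketch below collects the listed values in a small dataclass and derives two quantities from them. The field names are illustrative and are not taken from the repository's config.json. The per-head width assumes the usual multi-head attention layout, in which the hidden channels are split evenly across heads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelHParams:
    # Values copied from the lists above; the field names are hypothetical.
    sampling_rate: int = 44100
    n_mel_channels: int = 128
    hidden_channels: int = 192
    filter_channels: int = 768
    inter_channels: int = 192
    n_heads: int = 2
    n_layers: int = 6
    n_flow_layers: int = 3
    kernel_size: int = 3
    p_dropout: float = 0.1

    def head_dim(self) -> int:
        """Channels per attention head, assuming an even split across heads."""
        if self.hidden_channels % self.n_heads != 0:
            raise ValueError("hidden_channels must be divisible by n_heads")
        return self.hidden_channels // self.n_heads

    def ffn_expansion(self) -> float:
        """Ratio of feed-forward (filter) width to hidden width."""
        return self.filter_channels / self.hidden_channels


if __name__ == "__main__":
    hp = ModelHParams()
    print("channels per head:", hp.head_dim())
    print("feed-forward expansion:", hp.ffn_expansion())
```

With the values listed here, each of the 2 heads would cover 96 channels and the filter width is 4 times the hidden width.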
Training Dataset
This model was trained on the Jenny TTS Dataset, which is available on Hugging Face. The dataset consists of high-quality English speech recordings suitable for text-to-speech training.
Model Files
The repository contains several checkpoint files:
- DUR_*.pth: Duration predictor checkpoints
- G_*.pth: Generator model checkpoints
- D_*.pth: Discriminator model checkpoints
- config.json: Model configuration file
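When several checkpoints of the same kind are present, a small helper can pick the most recent one. The sketch below assumes that the part matched by `*` is an integer training step (for example `G_20000.pth`), which is a common convention but is not confirmed by this repository. The function name `latest_checkpoint` is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional


def latest_checkpoint(directory: str, prefix: str) -> Optional[Path]:
    """Return the <prefix>_<step>.pth file with the highest step, or None.

    Assumes the wildcard in names such as G_*.pth is an integer step count.
    """
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+)\.pth$")
    best_step = -1
    best_path = None
    for path in Path(directory).iterdir():
        match = pattern.match(path.name)
        if match is None:
            continue
        step = int(match.group(1))  # compare numerically, not as strings
        if step > best_step:
            best_step, best_path = step, path
    return best_path


if __name__ == "__main__":
    # Look in the current directory for the newest generator checkpoint.
    print(latest_checkpoint(".", "G"))
```

Comparing the step as an integer matters here: a plain string sort would rank `G_3000.pth` above `G_20000.pth`.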
Usage
To use this model with MeloTTS:
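from melo.api import TTS

# Paths to files downloaded from this repository. The checkpoint name is a
# placeholder: replace it with the actual G_*.pth file you want to load.
config_path = "config.json"
ckpt_path = "G_*.pth"

# Initialize TTS from the local config and generator checkpoint
tts = TTS(language="EN", device="auto", config_path=config_path, ckpt_path=ckpt_path)

# Look up the numeric speaker ID defined in the model config
speaker_id = tts.hps.data.spk2id["EN-default"]

# Generate speech and write it to a WAV file
tts.tts_to_file("Your text here", speaker_id, "output.wav")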
Training Details
The model was trained with the following specifications:
- Batch size: 6
- Learning rate: 0.0003
- Beta values: [0.8, 0.99]
- Segment size: 16384
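To put the segment size in perspective, the short calculation below converts it to seconds of audio. It assumes the segment size is measured in audio samples at the 44.1 kHz sampling rate listed under Model Details, which is not stated explicitly here. The function names are illustrative.

```python
SAMPLING_RATE_HZ = 44_100  # from Model Details
SEGMENT_SIZE = 16_384      # assumed to be a count of audio samples
BATCH_SIZE = 6


def segment_seconds(segment_size: int, sampling_rate_hz: int) -> float:
    """Duration of one training segment in seconds."""
    return segment_size / sampling_rate_hz


def batch_seconds(batch_size: int, segment_size: int, sampling_rate_hz: int) -> float:
    """Total audio duration covered by one batch of segments."""
    return batch_size * segment_seconds(segment_size, sampling_rate_hz)


if __name__ == "__main__":
    print(f"segment: {segment_seconds(SEGMENT_SIZE, SAMPLING_RATE_HZ):.3f} s")
    print(f"batch:   {batch_seconds(BATCH_SIZE, SEGMENT_SIZE, SAMPLING_RATE_HZ):.3f} s")
```

Under that assumption, each segment covers roughly 0.37 seconds of audio, so a batch of 6 covers a little over 2.2 seconds.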
Original Repository
This model is based on MeloTTS by MyShell.ai. Visit the original repository for more details about the architecture and implementation.
License
This model follows the same licensing as the original MeloTTS repository (MIT License).
Model tree for kadirnar/melotts-jenny
- Base model: myshell-ai/MeloTTS-English