Vocoder with HiFIGAN trained on custom German dataset

This repository provides all the necessary tools for using a HiFIGAN vocoder trained on a German dataset that was generated with mp3_to_training_data.

The pre-trained model (8 epochs so far) takes a spectrogram as input and produces a waveform as output. Typically, a vocoder is used after a TTS model that converts input text into a spectrogram.

How to use

Install speechbrain.

pip install speechbrain

Use a TTS model (e.g. tts-tacotron2-german) to generate a spectrogram, then convert it to audio with the vocoder.

import torchaudio
from speechbrain.pretrained import HIFIGAN, Tacotron2

# Load the German Tacotron2 TTS model and the HiFIGAN vocoder
tacotron2 = Tacotron2.from_hparams(source="padmalcom/tts-tacotron2-german", savedir="tmpdir_tts")
hifi_gan = HIFIGAN.from_hparams(source="padmalcom/tts-hifigan-german", savedir="tmpdir_vocoder")

# Run the TTS model to turn text into a mel spectrogram
mel_output, mel_length, alignment = tacotron2.encode_text("Mary had a little lamb")

# Run the vocoder to turn the spectrogram into a waveform
waveforms = hifi_gan.decode_batch(mel_output)

# Save the waveform at a sample rate of 22050 Hz
torchaudio.save("example_TTS.wav", waveforms.squeeze(1), 22050)
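
The vocoder can also be exercised on its own, without a TTS model. The sketch below is only a smoke test under stated assumptions: it feeds a random tensor shaped [batch, n_mels, frames], and the value of 80 mel channels is an assumption (it is not stated in this README, but it is what SpeechBrain's Tacotron2 typically emits). The function names decode_dummy_spectrogram and duration_seconds are hypothetical helpers, not part of SpeechBrain. Random input produces noise, so this checks shapes rather than audio quality.

```python
def duration_seconds(num_samples: int, sample_rate: int = 22050) -> float:
    """Length of a waveform in seconds at the given sample rate."""
    return num_samples / sample_rate


def decode_dummy_spectrogram(batch_size: int = 2, n_mels: int = 80, n_frames: int = 298):
    """Run the vocoder on a random spectrogram (assumed layout: [batch, n_mels, frames])."""
    # Heavy dependencies are imported lazily so duration_seconds works without them.
    import torch
    from speechbrain.pretrained import HIFIGAN

    hifi_gan = HIFIGAN.from_hparams(
        source="padmalcom/tts-hifigan-german", savedir="tmpdir_vocoder"
    )
    mel = torch.rand(batch_size, n_mels, n_frames)
    # decode_batch returns waveforms shaped [batch, 1, samples]
    waveforms = hifi_gan.decode_batch(mel)
    return waveforms
```

Calling decode_dummy_spectrogram() downloads the model on first use; passing waveforms.shape[-1] to duration_seconds gives the length of each generated clip at the 22050 Hz rate used above.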

Inference on GPU

To perform inference on the GPU, add run_opts={"device":"cuda"} when calling the from_hparams method.
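
As a minimal sketch of the option described above, the following passes run_opts to both models and falls back to the CPU when no GPU is present. The helper names pick_device and load_models are hypothetical and only serve the illustration; the from_hparams arguments are the ones used elsewhere in this README.

```python
import importlib.util


def pick_device(prefer_gpu: bool = True) -> str:
    """Return "cuda" if a GPU is requested and usable, otherwise "cpu"."""
    if not prefer_gpu or importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


def load_models(device: str):
    """Load the TTS model and the vocoder on the given device."""
    # Imported lazily so pick_device can be used without SpeechBrain installed.
    from speechbrain.pretrained import HIFIGAN, Tacotron2

    run_opts = {"device": device}
    tacotron2 = Tacotron2.from_hparams(
        source="padmalcom/tts-tacotron2-german",
        savedir="tmpdir_tts",
        run_opts=run_opts,
    )
    hifi_gan = HIFIGAN.from_hparams(
        source="padmalcom/tts-hifigan-german",
        savedir="tmpdir_vocoder",
        run_opts=run_opts,
    )
    return tacotron2, hifi_gan
```

After load_models(pick_device()), the encode_text and decode_batch calls stay the same. When running on the GPU, the resulting waveform will likely need to be moved back with .cpu() before it is passed to torchaudio.save.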
