facebook
/

fastspeech2-en-200_speaker-cv4

Model card Files Files and versions Community

fastspeech2-en-200_speaker-cv4 / README.md

Changhan's picture

Update README.md

e0218c2 almost 3 years ago

|

history blame contribute delete

2.21 kB

	---
	library_name: fairseq
	task: text-to-speech
	tags:
	- fairseq
	- audio
	- text-to-speech
	- multi-speaker
	language: en
	datasets:
	- common_voice
	widget:
	- text: "Hello, this is a test run."
	example_title: "Hello, this is a test run."
	---
	# fastspeech2-en-200_speaker-cv4

	[FastSpeech 2](https://arxiv.org/abs/2006.04558) text-to-speech model from fairseq S^2 ([paper](https://arxiv.org/abs/2109.06912)/[code](https://github.com/pytorch/fairseq/tree/main/examples/speech_synthesis)):
	- English
	- 200 male/female voices (random speaker when using the widget)
	- Trained on [Common Voice v4](https://commonvoice.mozilla.org/en/datasets)

	## Usage

	```python
	from fairseq.checkpoint_utils import load_model_ensemble_and_task_from_hf_hub
	from fairseq.models.text_to_speech.hub_interface import TTSHubInterface
	import IPython.display as ipd


	models, cfg, task = load_model_ensemble_and_task_from_hf_hub(
	"facebook/fastspeech2-en-200_speaker-cv4",
	arg_overrides={"vocoder": "hifigan", "fp16": False}
	)
	model = models[0]
	TTSHubInterface.update_cfg_with_data_cfg(cfg, task.data_cfg)
	generator = task.build_generator(model, cfg)

	text = "Hello, this is a test run."

	sample = TTSHubInterface.get_model_input(task, text)
	wav, rate = TTSHubInterface.get_prediction(task, model, generator, sample)

	ipd.Audio(wav, rate=rate)
	```

	See also [fairseq S^2 example](https://github.com/pytorch/fairseq/blob/main/examples/speech_synthesis/docs/common_voice_example.md).

	## Citation

	```bibtex
	@inproceedings{wang-etal-2021-fairseq,
	title = "fairseq S{\^{}}2: A Scalable and Integrable Speech Synthesis Toolkit",
	author = "Wang, Changhan and
	Hsu, Wei-Ning and
	Adi, Yossi and
	Polyak, Adam and
	Lee, Ann and
	Chen, Peng-Jen and
	Gu, Jiatao and
	Pino, Juan",
	booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
	month = nov,
	year = "2021",
	address = "Online and Punta Cana, Dominican Republic",
	publisher = "Association for Computational Linguistics",
	url = "https://aclanthology.org/2021.emnlp-demo.17",
	doi = "10.18653/v1/2021.emnlp-demo.17",
	pages = "143--152",
	}
	```