	---
	license: cc-by-nc-4.0
	pipeline_tag: audio-to-audio
	library_name: f5-tts
	extra_gated_prompt: "You agree to not use the model to generate, share, or promote content that is illegal, harmful, deceptive, or intended to impersonate real individuals without their informed consent."
	extra_gated_fields:
	  Affiliation: text
	  Country: country
	  I agree to use this model for non-commercial use ONLY: checkbox
	---

	# EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion

	[![github](https://img.shields.io/badge/Github-code-brightgreen)](https://github.com/EZ-VC/EZ-VC)
	[![arXiv](https://img.shields.io/badge/arXiv-2505.16691-b31b1b.svg?logo=arXiv)](https://arxiv.org/abs/2505.16691)
	[![demo](https://img.shields.io/badge/Demo-page-yellow.svg)](https://ez-vc.github.io/EZ-VC-Demo/)
	[![lab](https://img.shields.io/badge/SPRING-Lab-purple)](https://asr.iitm.ac.in/)
	<!-- <img src="https://github.com/user-attachments/assets/12d7749c-071a-427c-81bf-b87b91def670" alt="Watermark" style="width: 40px; height: auto"> -->


	### Our paper has been accepted to the Findings of EMNLP 2025!

	## Installation

	### Create a separate environment if needed

	```bash
	# Create a python 3.10 conda env (you could also use virtualenv)
	conda create -n ez-vc python=3.10
	conda activate ez-vc
	```

	### Local installation

	```bash
	git clone https://github.com/EZ-VC/EZ-VC
	cd EZ-VC
	git submodule update --init --recursive
	pip install -e .

	# Install ESPnet for XEUS (use exactly this fork and branch)
	pip install 'espnet @ git+https://github.com/wanchichen/espnet.git@ssl'
	```
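After installing, a quick sanity check can confirm that the environment matches the steps above. The following is a minimal sketch, not part of the EZ-VC codebase: the module names `f5_tts` (inferred from the `src/f5_tts` path) and `espnet` (inferred from the pip command) are assumptions, and the helper names are hypothetical.

```python
import importlib.util
import sys

# Assumed top-level module names: `f5_tts` (from the src/f5_tts path in this repo)
# and `espnet` (from the pip command above). Adjust if your install differs.
REQUIRED_MODULES = ("f5_tts", "espnet")


def missing_modules(names):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_version_ok(expected=(3, 10), version=None):
    """Check that major.minor matches the Python version used for the conda env above."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) == tuple(expected)


if __name__ == "__main__":
    if not python_version_ok():
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        print(f"Warning: running Python {found}, expected 3.10")
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it inside the activated `ez-vc` environment. It only checks that the modules can be located, not that they import without errors or that the correct ESPnet fork is installed.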

	## Inference

	We provide a Jupyter notebook for inference at `src/f5_tts/infer/infer.ipynb`.

	Open the [inference notebook](https://github.com/EZ-VC/EZ-VC/blob/main/src/f5_tts/infer/infer.ipynb) and run all cells.

	The converted audio will be available in the last cell.
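If you prefer to run the notebook without opening it, one option is to execute it with `jupyter nbconvert`. This is a minimal sketch under stated assumptions: Jupyter and nbconvert are installed in the environment, the script runs from the repository root, and the notebook's audio paths already point at your files. The helper names and the output filename `infer_executed.ipynb` are hypothetical. Because this model is gated, downloading the checkpoint may also require being logged in to Hugging Face.

```python
import subprocess
from pathlib import Path

# Notebook path taken from this README; the output filename below is a hypothetical choice.
NOTEBOOK = Path("src/f5_tts/infer/infer.ipynb")


def nbconvert_command(notebook, output_name="infer_executed.ipynb"):
    """Build a `jupyter nbconvert` command that executes every cell and saves a copy."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute", str(notebook),
        "--output", output_name,
    ]


def run_notebook(notebook=NOTEBOOK):
    """Execute the notebook; raises CalledProcessError if any cell fails."""
    if not Path(notebook).is_file():
        raise FileNotFoundError(
            f"{notebook} not found; run this from the EZ-VC repository root"
        )
    subprocess.run(nbconvert_command(notebook), check=True)
```

Call `run_notebook()` from the repository root, then open the executed copy to listen to the converted audio in the last cell.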


	## Acknowledgements

	- [F5-TTS](https://arxiv.org/abs/2410.06885), for open-sourcing the code that made EZ-VC possible.

	## Citation
	If our work and codebase are useful to you, please cite:
	```
	@misc{joglekar2025ezvceasyzeroshotanytoany,
	  title={EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion},
	  author={Advait Joglekar and Divyanshu Singh and Rooshil Rohit Bhatia and S. Umesh},
	  year={2025},
	  eprint={2505.16691},
	  archivePrefix={arXiv},
	  primaryClass={cs.SD},
	  url={https://arxiv.org/abs/2505.16691},
	}
	```
	## License

	Our code is released under the MIT License. The pre-trained models are licensed under CC-BY-NC 4.0, which permits non-commercial use only. We apologize for any inconvenience this may cause.