open-universe: Generative Speech Enhancement with Score-based Diffusion and Adversarial Training

This repository contains the configurations and weights for the UNIVERSE++ and UNIVERSE models implemented in open-universe.

The models were trained on the Voicebank-DEMAND dataset at 16 kHz.

The performance on the test split of Voicebank-DEMAND is given in the following table.

model	si-sdr	pesq-wb	stoi-ext	lsd	lps	OVRL	SIG	BAK
UNIVERSE++	18.624	3.017	0.864	4.867	0.937	3.200	3.489	4.040
UNIVERSE	17.600	2.830	0.844	6.318	0.920	3.157	3.457	4.013
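
To make the gap between the two rows easier to read, the sketch below subtracts the UNIVERSE scores from the UNIVERSE++ scores for each metric. The numbers are copied from the table; the helper name `metric_deltas` is only illustrative and is not part of open-universe. For lsd, a negative difference favors UNIVERSE++ under the usual convention that a lower log-spectral distance is better, which this page does not state explicitly.

```python
# Scores copied from the table above (Voicebank-DEMAND test split).
METRICS = ["si-sdr", "pesq-wb", "stoi-ext", "lsd", "lps", "OVRL", "SIG", "BAK"]
UNIVERSE_PP = [18.624, 3.017, 0.864, 4.867, 0.937, 3.200, 3.489, 4.040]
UNIVERSE = [17.600, 2.830, 0.844, 6.318, 0.920, 3.157, 3.457, 4.013]


def metric_deltas(new, old):
    """Return {metric: new - old}, rounded to the table's three decimals."""
    return {m: round(a - b, 3) for m, a, b in zip(METRICS, new, old)}


deltas = metric_deltas(UNIVERSE_PP, UNIVERSE)
for name, delta in deltas.items():
    # Positive values mean UNIVERSE++ scored higher on that metric.
    print(f"{name:>8}: {delta:+.3f}")
```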

Usage

Start by installing open-universe. We use conda to simplify the installation.

git clone https://github.com/line/open-universe.git
cd open-universe
conda env create -f environment.yaml
conda activate open-universe
python -m pip install .

Once installed, either model can be run on a folder of audio files as follows.

# UNIVERSE++ (default model)
python -m open_universe.bin.enhance <input/folder> <output/folder> \
  --model line-corporation/open-universe:plusplus

# UNIVERSE
python -m open_universe.bin.enhance <input/folder> <output/folder> \
  --model line-corporation/open-universe:original
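
To compare both models on the same recordings, the two commands above can be driven from Python. The following is a minimal sketch that only assembles the command lines shown in this section; the function name, the `MODELS` mapping, and the folder names `noisy_wavs` and `enhanced` are hypothetical. It prints the commands instead of running them, and whether the tool creates missing output folders is not stated here.

```python
import shlex
import sys

# Model identifiers taken from the commands above.
MODELS = {
    "plusplus": "line-corporation/open-universe:plusplus",
    "original": "line-corporation/open-universe:original",
}


def build_enhance_command(input_dir, output_dir, model_key="plusplus"):
    """Return the argument list for one enhancement run."""
    return [
        sys.executable, "-m", "open_universe.bin.enhance",
        str(input_dir), str(output_dir),
        "--model", MODELS[model_key],
    ]


# Hypothetical folders: one shared input, one output folder per model.
for key in MODELS:
    cmd = build_enhance_command("noisy_wavs", f"enhanced/{key}", key)
    print(shlex.join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` to actually run the enhancement inside the activated `open-universe` environment.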

Referencing open-universe and UNIVERSE++

If you use these models in your work, please consider citing the following paper.

@inproceedings{universepp,
    author={Scheibler, Robin and Fujita, Yusuke and Shirahata, Yuma and Komatsu, Tatsuya},
    title={Universal Score-based Speech Enhancement with High Content Preservation},
    booktitle={Proc. Interspeech 2024},
    month=sep,
    year=2024
}

Referencing UNIVERSE

@misc{universe,
    author={Serr\`a, Joan and Pascual, Santiago and Pons, Jordi and Araz, R. Oguz and Scaini, Davide},
    title={Universal Speech Enhancement with Score-based Diffusion},
    howpublished={arXiv:2206.03065},
    month=sep,
    year=2022
}
