Update README.md
README.md
CHANGED
@@ -10,7 +10,7 @@ datasets:
 - gathered from Uberduck's discord server, put together by Crust.
 ---
 
-# **CRUST** (Chungus Related Uberduck's Speech toy)
+# **CRUST - UNRELEASED** (Chungus Related Uberduck's Speech toy)
 # Welcome to Crust 🍕⭕
 
 Crust is a 168 speaker model based on uberduck's pipeline. We've noticed that having multiple speakers instead of having one speaker, improves the performance of the model and makes it be able to synthesize comparable results with only 1 minute of data. The results are surprisingly good and because of the lower dataset, batch size can be lowered and the model is generally faster than other models.
@@ -45,4 +45,6 @@ Even though this model can be trained on 1 minute of data, we still recommend tr
 
 Sadly, the model has only been trained on 22050 hz and mono audio files, while this still sounds good when there's a Hi-Fi Gan vocoder, It's still going to not have stereo sound (which would not be that useful) or 44100 hz audio quality on its own. Sadly the Hi-Fi Gan vocoder does also bring in artifacts into the wav files which makes synthesis not as realistic.
 
-We used [**Uberduck's TTS Pipeline on github**](https://github.com/uberduck-ai/uberduck-ml-dev) To train our model.
+We used [**Uberduck's TTS Pipeline on github**](https://github.com/uberduck-ai/uberduck-ml-dev) To train our model.
+
+**You can't yet download the model nor see any results, the model is still in its training phase and should soon be released**