---
license: mit
language:
- en
pipeline_tag: text-to-speech
library_name: vui
---
# vui
[DEMO](https://fluxions.ai)
https://github.com/fluxions-ai/vui
Small conversational speech models that can run on device.
# Installation
```sh
git clone https://github.com/fluxions-ai/vui && cd vui
uv pip install -e .
```
# Demo
```sh
python demo.py
```
# Models
Vui.BASE is the base checkpoint, trained on 40k hours of audio conversations.
Vui.ABRAHAM is a single-speaker model that can reply with context awareness.
Vui.COHOST is a checkpoint with two speakers that can talk to each other.
# Voice Cloning
You can clone voices with the base model quite well, but the results are not perfect, as it hasn't seen that much audio and wasn't trained for long.
# FAQ
1) Hardware: the model was developed on two 4090s. https://x.com/harrycblum/status/1752698806184063153
2) Hallucinations: yes, the model does hallucinate, but this is the best I could do with limited resources! :(