---
license: apache-2.0
language:
- tw
- ak
library_name: pytorch
tags:
- speechless
- rvq
- whisper
- twi
- akan
- vector-quantization
- semantic-tokens
---

# Speechless TWI — Stage 1 (RVQ for Whisper Encoder)

A trained residual vector quantizer (RVQ) that discretizes Whisper encoder features into semantic tokens for **Twi/Akan**.
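To illustrate what discretizing a feature vector into tokens means, here is a minimal, dependency-free sketch of residual vector quantization. It is not the code behind `RVQWrapper`, whose internals may differ. The function names, the toy sizes, and the random codebooks are all assumptions made for illustration; a trained model uses learned codebooks.

```python
import random


def nearest_code(vec, codebook):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vec, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def rvq_encode(vec, codebooks):
    """Quantize vec stage by stage, emitting one token per codebook."""
    residual = list(vec)
    tokens = []
    for codebook in codebooks:
        idx = nearest_code(residual, codebook)
        tokens.append(idx)
        # Subtract the chosen code so the next stage models what is left over.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tokens, residual


def rvq_decode(tokens, codebooks):
    """Reconstruct an approximation by summing the selected codes."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(tokens, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical toy sizes; the real values live in config_stage1.json.
rng = random.Random(0)
dim, num_quantizers, codebook_size = 4, 3, 8
codebooks = [
    [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(codebook_size)]
    for _ in range(num_quantizers)
]

feature = [rng.uniform(-1, 1) for _ in range(dim)]  # stand-in for one encoder frame
tokens, residual = rvq_encode(feature, codebooks)
reconstruction = rvq_decode(tokens, codebooks)
```

Each frame becomes `num_quantizers` integer tokens, and the reconstruction plus the final residual adds back up to the original vector.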

## Files
- `rvq_final.pt` — state dict
- `config_stage1.json` — training/config params
- `rvq_wrapper.py` — tiny module defining `RVQWrapper`

## Usage (example)
```python
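import json
import os
import sys

import torch
from huggingface_hub import hf_hub_download

REPO_ID = "ik/speechless-twi-stage1-rvq-whisper-medium"

# rvq_wrapper.py ships in the model repo, so fetch it and make it importable.
sys.path.insert(0, os.path.dirname(hf_hub_download(REPO_ID, "rvq_wrapper.py")))
from rvq_wrapper import RVQWrapper

with open(hf_hub_download(REPO_ID, "config_stage1.json"), "r") as f:
    cfg = json.load(f)
ckpt = torch.load(hf_hub_download(REPO_ID, "rvq_final.pt"), map_location="cpu")

# Rebuild the quantizer from the saved hyperparameters, then load the weights.
rvq = RVQWrapper(cfg["rvq_dim"], cfg["rvq_num_quantizers"], cfg["rvq_codebook_size"])
rvq.load_state_dict(ckpt["rvq"])
rvq.eval()  # switch to inference mode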
```