Upload RVQ Stage-1 (2025-09-02)
metadata
license: apache-2.0
language:
  - tw
  - ak
library_name: pytorch
tags:
  - speechless
  - rvq
  - whisper
  - twi
  - akan
  - vector-quantization
  - semantic-tokens

Speechless TWI — Stage 1 (RVQ for Whisper Encoder)

A trained residual vector quantizer (RVQ) that discretizes Whisper encoder features into semantic tokens for Twi/Akan speech.
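To make the idea of "discretizing features into tokens" concrete, here is a minimal NumPy sketch of residual vector quantization in general. It is not the code in rvq_wrapper.py: the function name `rvq_encode`, the random codebooks, and the toy shapes are all assumptions chosen for illustration. Each stage picks the nearest codebook entry for the current residual, records its index as a token, and passes what is left over to the next stage.

```python
import numpy as np


def rvq_encode(x, codebooks):
    """Quantize x of shape (T, D) with a list of (K, D) codebooks.

    Returns integer tokens of shape (T, Q) and the (T, D) reconstruction,
    where Q is the number of codebooks (quantizer stages).
    """
    residual = x.astype(np.float64)  # astype copies, so x is left untouched
    recon = np.zeros_like(residual)
    indices = []
    for cb in codebooks:
        # Squared Euclidean distance from every frame to every code: (T, K)
        dist = ((residual[:, None, :] - cb[None, :, :]) ** 2).sum(axis=-1)
        idx = dist.argmin(axis=1)      # nearest code per frame
        chosen = cb[idx]               # (T, D)
        recon += chosen                # accumulate the approximation
        residual -= chosen             # next stage quantizes what is left
        indices.append(idx)
    return np.stack(indices, axis=1), recon


# Toy data only: 5 frames of 8-dim features, 3 stages of 16 codes each.
rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 8))
books = [rng.normal(size=(16, 8)) for _ in range(3)]
tokens, approx = rvq_encode(feats, books)
print(tokens.shape)  # (5, 3)
```

In this sketch every frame ends up with one token per quantizer stage, and summing the selected codebook entries gives back the approximation of the original feature vector.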

Files

  • rvq_final.pt — state dict
  • config_stage1.json — training/config params
  • rvq_wrapper.py — tiny module defining RVQWrapper

Usage (example)

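import json
import sys
from pathlib import Path

import torch
from huggingface_hub import hf_hub_download

REPO_ID = "ik/speechless-twi-stage1-rvq-whisper-medium"

# rvq_wrapper.py ships in the repo; download it and make its folder importable.
sys.path.insert(0, str(Path(hf_hub_download(REPO_ID, "rvq_wrapper.py")).parent))
from rvq_wrapper import RVQWrapper

with open(hf_hub_download(REPO_ID, "config_stage1.json"), "r") as f:
    cfg = json.load(f)
ckpt = torch.load(hf_hub_download(REPO_ID, "rvq_final.pt"), map_location="cpu")

# Build the quantizer from the saved config, then load the weights stored under "rvq".
rvq = RVQWrapper(cfg["rvq_dim"], cfg["rvq_num_quantizers"], cfg["rvq_codebook_size"])
rvq.load_state_dict(ckpt["rvq"])
rvq.eval()  # inference mode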
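The usage snippet reads three keys from config_stage1.json. As a hedged sketch, a small helper can check that those keys are present before constructing the model, so a mismatched config fails with a clear message. The helper name `rvq_args_from_config` and the numbers in `example_cfg` are hypothetical placeholders, not values taken from the released config.

```python
REQUIRED_KEYS = ("rvq_dim", "rvq_num_quantizers", "rvq_codebook_size")


def rvq_args_from_config(cfg):
    """Return (dim, num_quantizers, codebook_size) or raise if a key is missing."""
    missing = [key for key in REQUIRED_KEYS if key not in cfg]
    if missing:
        raise KeyError(f"config is missing keys: {missing}")
    return tuple(int(cfg[key]) for key in REQUIRED_KEYS)


# Placeholder values for illustration only; read the real ones from config_stage1.json.
example_cfg = {"rvq_dim": 1024, "rvq_num_quantizers": 8, "rvq_codebook_size": 1024}
args = rvq_args_from_config(example_cfg)
```

With the real config loaded as `cfg`, the resulting tuple could be unpacked into the constructor as `RVQWrapper(*rvq_args_from_config(cfg))`.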