Whisper model fine-tuned on audio data from the Open STT Russian Dataset (https://github.com/snakers4/open_stt).
There is a difference in how the source data is normalized: in our data normalization process, we replace punctuation with "" (the empty string) rather than with " " (a space), as Whisper does. This mismatch leads to a slight degradation on CommonVoice.
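The effect of the two replacement rules can be illustrated with a minimal sketch. The `strip_punct` helper and the sample sentence are hypothetical and are not the normalization code used for this model or by Whisper; they only assume that punctuation means any character in a Unicode `P*` category.

```python
import unicodedata


def strip_punct(text: str, replacement: str) -> str:
    """Replace each Unicode punctuation character (category P*) with
    `replacement`, then collapse runs of whitespace. Illustrative only."""
    out = "".join(
        replacement if unicodedata.category(ch).startswith("P") else ch
        for ch in text
    )
    return " ".join(out.split())


sample = "кто-то пришел, но поздно"

joined = strip_punct(sample, "")   # punctuation -> "" (rule described for this model)
spaced = strip_punct(sample, " ")  # punctuation -> " " (rule described for Whisper)

print(joined)
print(spaced)
print(len(joined.split()), len(spaced.split()))
```

With the empty-string rule a hyphenated word stays one token, while with the space rule it splits into two, so the same transcript can produce different word sequences and therefore a different WER depending on which rule the reference text was normalized with.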
Evaluation results
- WER on mozilla-foundation/common_voice_11_0 test set (self-reported): 9.650