Japanese ASR
This repo contains models and datasets for Japanese ASR. See our main model at https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0.
This repository contains all the models and datasets for training and evaluating Japanese ASR that were produced while developing the kotoba-whisper models.
The following tables compare CER and WER for models distilled from openai/whisper-large-v3 on different amounts of ReazonSpeech data. Model names follow the pattern `japanese-asr/distil-whisper-large-v3-ja-reazonspeech-{size of reazonspeech}`.
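As a small illustration of the naming pattern, the sketch below builds the model IDs for each data size. The size labels are taken from the model names in the tables on this page; the helper name `model_id` and the constants are hypothetical and not part of any library.

```python
# Size labels as they appear in the model names listed on this page.
SIZES = ["tiny", "small", "medium", "large", "all"]

# Naming pattern quoted in the text; {size} is the ReazonSpeech subset label.
PATTERN = "japanese-asr/distil-whisper-large-v3-ja-reazonspeech-{size}"


def model_id(size: str) -> str:
    """Return the model ID for one ReazonSpeech subset size (hypothetical helper)."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {SIZES}")
    return PATTERN.format(size=size)


# All five distilled model IDs, from the smallest to the largest data size.
MODEL_IDS = [model_id(size) for size in SIZES]
```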
CER
Model | CommonVoice 8 (Japanese test set) | JSUT Basic 5000 | ReazonSpeech (held-out test set) |
---|---|---|---|
japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all | 9.2 | 8.4 | 11.6 |
japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large | 9.4 | 8.5 | 12.2 |
japanese-asr/distil-whisper-large-v3-ja-reazonspeech-medium | 10.9 | 11.3 | 14.8 |
japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small | 30.2 | 39 | 40.7 |
japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny | 94.8 | 96.3 | 96.7 |
openai/whisper-large-v3 | 8.5 | 7.1 | 14.9 |
openai/whisper-large-v2 | 9.7 | 8.2 | 28.1 |
openai/whisper-large | 10 | 8.9 | 34.1 |
openai/whisper-medium | 11.5 | 10 | 33.2 |
openai/whisper-base | 28.6 | 24.9 | 70.4 |
openai/whisper-small | 15.1 | 14.2 | 41.5 |
openai/whisper-tiny | 53.7 | 36.5 | 137.9 |
reazon-research/reazonspeech-nemo-v2 | 9.1 | 7.4 | 11.2 |
WER
Model | CommonVoice 8 (Japanese test set) | JSUT Basic 5000 | ReazonSpeech (held-out test set) |
---|---|---|---|
japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all | 58.8 | 63.7 | 55.6 |
japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large | 59.2 | 64.3 | 56.4 |
japanese-asr/distil-whisper-large-v3-ja-reazonspeech-medium | 64.6 | 72.1 | 63 |
japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small | 85 | 94.2 | 82.1 |
japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny | 100 | 100 | 99 |
openai/whisper-large-v3 | 55.1 | 59.2 | 60.2 |
openai/whisper-large-v2 | 59.3 | 63.2 | 74.1 |
openai/whisper-large | 61.1 | 66.4 | 74.9 |
openai/whisper-medium | 63.4 | 69.5 | 76 |
openai/whisper-base | 87.2 | 93 | 91.8 |
openai/whisper-small | 74.2 | 81.9 | 83 |
openai/whisper-tiny | 93.8 | 97.6 | 94.9 |
reazon-research/reazonspeech-nemo-v2 | 57.5 | 60.6 | 47.5 |
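For readers unfamiliar with the two metrics, here is a minimal sketch of the standard edit-distance definition of CER and WER, expressed as percentages. This page does not state how text was normalized or segmented before scoring, so the character-level split and the whitespace split below are assumptions, and the function names are illustrative only. The sketch is not the evaluation code used to produce the tables.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(ref_tokens: Sequence, hyp_tokens: Sequence) -> float:
    """Edit distance divided by reference length, as a percentage."""
    if len(ref_tokens) == 0:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: tokens are single characters (assumed, no normalization)."""
    return error_rate(list(reference), list(hypothesis))


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: tokens are whitespace-separated words (assumed segmentation)."""
    return error_rate(reference.split(), hypothesis.split())
```

Because insertions count as errors and the denominator is the reference length, the rate can exceed 100 when a hypothesis is much longer than its reference, which is how a value such as 137.9 can arise. For Japanese, WER also depends on how the text is segmented into words, so WER figures are only comparable under the same segmentation.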
Note that kotoba-tech/kotoba-whisper-v1.0 is an alias of japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large and kotoba-tech/kotoba-whisper-v2.0 is an alias of japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all.
More detailed results are available in the kotoba-whisper codebase.
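The alias relationship noted above can be written down as a lookup, followed by a hedged example of loading a model with the Hugging Face `transformers` ASR pipeline. The names `ALIASES`, `resolve`, and `transcribe`, and the `audio_path` argument, are hypothetical. The sketch assumes `transformers` and an audio backend are installed and that the clip is short enough to transcribe in one pass.

```python
# Alias -> underlying model ID, as stated in the note on this page.
ALIASES = {
    "kotoba-tech/kotoba-whisper-v1.0":
        "japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large",
    "kotoba-tech/kotoba-whisper-v2.0":
        "japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all",
}


def resolve(model_id: str) -> str:
    """Map a kotoba-whisper alias to its japanese-asr model ID; pass other IDs through."""
    return ALIASES.get(model_id, model_id)


def transcribe(audio_path: str, model_id: str = "kotoba-tech/kotoba-whisper-v2.0") -> str:
    """Transcribe one short audio file (hypothetical helper; downloads the model on first use)."""
    # Imported lazily so the alias lookup works without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=resolve(model_id))
    return asr(audio_path)["text"]
```

Either name should refer to the same model according to the note, so resolving the alias is optional; the lookup only makes the correspondence with the table rows explicit.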