Update README.md
README.md
CHANGED
@@ -216,40 +216,75 @@ Stage 1 employed low threshold values (0 to 0.30 BLEU depending on dataset), whe
|
|
216 |
| NST | 250 | 250 |
|
217 |
| **Total** | **56,514** | **8,533** |
|
218 |
|
219 |
-
The default when loading our models through Hugging Face is **Stage 2**. We have however also uploaded continued pretraining checkpoints and tagged them. You can load these other checkpoints by specifying the `revision` in `.from_pretrained()`. The pretrained checkpoints tag can for example be found here: [`pretrained-checkpoint`](https://huggingface.co/KBLab/kb-whisper-
|
220 |
|
221 |
### Evaluation
|
222 |
|
223 |
|
224 |
-
#### WER
|
225 |
| Model size | | FLEURS | CommonVoice | NST |
|
226 |
|------------|---------|--------|-------------|------|
|
227 |
| [tiny](https://huggingface.co/KBLab/kb-whisper-tiny) | **KBLab** | **13.2** | **12.9** | **11.2** |
|
228 |
| | OpenAI | 59.2 | 67.8 | 85.2 |
|
229 |
| [base](https://huggingface.co/KBLab/kb-whisper-base) | **KBLab** | **9.1** | **8.7** | **7.8** |
|
230 |
| | OpenAI | 39.6 | 52.1 | 53.4 |
|
231 |
-
| [small](https://huggingface.co/KBLab/kb-whisper-small)
|
232 |
| | OpenAI | 20.6 | 26.4 | 26.4 |
|
233 |
-
| [medium](https://huggingface.co/KBLab/kb-whisper-medium)
|
234 |
| | OpenAI | 12.1 | 15.8 | 17.1 |
|
235 |
-
| [large-v3](https://huggingface.co/KBLab/kb-whisper-large)
|
236 |
| | OpenAI | 7.8 | 9.5 | 11.3 |
|
237 |
|
|
|
238 |
|
239 |
-
|
240 |
| Model size | | FLEURS | CommonVoice | NST |
|
241 |
|------------|---------|--------|-------------|------|
|
242 |
-
| tiny | KBLab | **76.6** | **73.7** | **74.3** |
|
243 |
| | OpenAI | 26.9 | 21.1 | 24.0 |
|
244 |
-
| base | KBLab | **83.2** | **79.9** | **78.3** |
|
245 |
| | OpenAI | 41.1 | 32.5 | 36.9 |
|
246 |
-
| small
|
247 |
| | OpenAI | 64.0 | 56.5 | 58.2 |
|
248 |
-
| medium
|
249 |
| | OpenAI | 77.1 | 70.1 | 68.9 |
|
250 |
-
| large-v3
|
251 |
| | OpenAI | 84.9 | 79.1 | 75.1 |
|
252 |
|
253 |
|
254 |
### Acknowledgements
|
255 |
|
|
|
216 |
| NST | 250 | 250 |
|
217 |
| **Total** | **56,514** | **8,533** |
|
218 |
|
219 |
+
Loading our models through Hugging Face gives you the **Stage 2** checkpoint by default. We have also uploaded the continued pretraining checkpoints and tagged them; you can load any tagged checkpoint by passing its tag as `revision` to `.from_pretrained()`. The pretrained checkpoint for the large model, for example, is tagged [`pretrained-checkpoint`](https://huggingface.co/KBLab/kb-whisper-large/tree/pretrained-checkpoint). The default Stage 2 checkpoint is tagged `standard`. We also supply two other Stage 2 checkpoints: `subtitle`, which transcribes in a more condensed style, and `strict`, which is more verbose.
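
As a rough illustration of the `revision` mechanism described above, the sketch below wraps `.from_pretrained()` in a small helper. The helper name `load_kb_whisper` and the `KNOWN_REVISIONS` tuple are hypothetical and exist only for this example; the tag names are the ones mentioned in this README, and not every tag is necessarily available for every model size.

```python
# Tags named in this README. This list is an assumption for illustration;
# check the model page on the Hub for the tags that actually exist.
KNOWN_REVISIONS = ("standard", "strict", "subtitle", "pretrained-checkpoint")


def load_kb_whisper(model_id="KBLab/kb-whisper-large", revision="standard"):
    """Load a KB-Whisper checkpoint at a given tag (hypothetical helper)."""
    if revision not in KNOWN_REVISIONS:
        raise ValueError(
            f"unknown revision {revision!r}; expected one of {KNOWN_REVISIONS}"
        )
    # Imported lazily so the tag check above works without transformers installed.
    from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor

    # `revision` selects a branch, tag, or commit of the Hub repository.
    model = AutoModelForSpeechSeq2Seq.from_pretrained(model_id, revision=revision)
    processor = AutoProcessor.from_pretrained(model_id, revision=revision)
    return model, processor
```

Calling `load_kb_whisper(revision="strict")` would then fetch the more verbose Stage 2 checkpoint, assuming that tag exists for the chosen model size.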
|
220 |
|
221 |
### Evaluation
|
222 |
|
223 |
|
224 |
+
#### WER compared to OpenAI
|
225 |
| Model size | | FLEURS | CommonVoice | NST |
|
226 |
|------------|---------|--------|-------------|------|
|
227 |
| [tiny](https://huggingface.co/KBLab/kb-whisper-tiny) | **KBLab** | **13.2** | **12.9** | **11.2** |
|
228 |
| | OpenAI | 59.2 | 67.8 | 85.2 |
|
229 |
| [base](https://huggingface.co/KBLab/kb-whisper-base) | **KBLab** | **9.1** | **8.7** | **7.8** |
|
230 |
| | OpenAI | 39.6 | 52.1 | 53.4 |
|
231 |
+
| [small](https://huggingface.co/KBLab/kb-whisper-small) | **KBLab** | **7.3** | **6.4** | **6.6** |
|
232 |
| | OpenAI | 20.6 | 26.4 | 26.4 |
|
233 |
+
| [medium](https://huggingface.co/KBLab/kb-whisper-medium) | **KBLab** | **6.6** | **5.4** | **5.8** |
|
234 |
| | OpenAI | 12.1 | 15.8 | 17.1 |
|
235 |
+
| [large-v3](https://huggingface.co/KBLab/kb-whisper-large) | **KBLab** | **5.4** | **4.1** | **5.2** |
|
236 |
| | OpenAI | 7.8 | 9.5 | 11.3 |
|
237 |
|
238 |
+
#### WER for different KBLab Stage 2 versions
|
239 |
|
240 |
+
| Model size | | FLEURS | CommonVoice | NST |
|
241 |
+
|------------|---------|--------|-------------|------|
|
242 |
+
| [tiny](https://huggingface.co/KBLab/kb-whisper-tiny) | **standard** | **13.2** | **12.9** | **11.2** |
|
243 |
+
| | strict | 14.1 | 13.4 | 11.0 |
|
244 |
+
| | subtitle | 13.3 | 12.9 | 11.4 |
|
245 |
+
| [base](https://huggingface.co/KBLab/kb-whisper-base) | **standard** | **9.1** | **8.7** | **7.8** |
|
246 |
+
| | strict | 10.4 | 9.6 | 8.4 |
|
247 |
+
| | subtitle | 9.1 | 8.7 | 7.9 |
|
248 |
+
| [small](https://huggingface.co/KBLab/kb-whisper-small) | **standard** | **7.3** | **6.4** | **6.6** |
|
249 |
+
| | strict | 8.2 | 7.0 | 6.7 |
|
250 |
+
| | subtitle | 7.3 | 6.4 | 6.6 |
|
251 |
+
| [medium](https://huggingface.co/KBLab/kb-whisper-medium) | **standard** | **6.6** | **5.4** | **5.8** |
|
252 |
+
| | strict | 6.8 | 5.4 | 6.0 |
|
253 |
+
| [large-v3](https://huggingface.co/KBLab/kb-whisper-large) | **standard** | **5.4** | **4.1** | **5.2** |
|
254 |
+
| | strict | 5.3 | 4.0 | 5.1 |
|
255 |
+
|
256 |
+
#### BLEU Score compared to OpenAI
|
257 |
| Model size | | FLEURS | CommonVoice | NST |
|
258 |
|------------|---------|--------|-------------|------|
|
259 |
+
| [tiny](https://huggingface.co/KBLab/kb-whisper-tiny) | **KBLab** | **76.6** | **73.7** | **74.3** |
|
260 |
| | OpenAI | 26.9 | 21.1 | 24.0 |
|
261 |
+
| [base](https://huggingface.co/KBLab/kb-whisper-base) | **KBLab** | **83.2** | **79.9** | **78.3** |
|
262 |
| | OpenAI | 41.1 | 32.5 | 36.9 |
|
263 |
+
| [small](https://huggingface.co/KBLab/kb-whisper-small) | **KBLab** | **86.6** | **83.5** | **79.6** |
|
264 |
| | OpenAI | 64.0 | 56.5 | 58.2 |
|
265 |
+
| [medium](https://huggingface.co/KBLab/kb-whisper-medium) | **KBLab** | **87.6** | **85.0** | **80.2** |
|
266 |
| | OpenAI | 77.1 | 70.1 | 68.9 |
|
267 |
+
| [large-v3](https://huggingface.co/KBLab/kb-whisper-large) | **KBLab** | **89.8** | **87.2** | **81.1** |
|
268 |
| | OpenAI | 84.9 | 79.1 | 75.1 |
|
269 |
|
270 |
+
#### BLEU Score for different KBLab Stage 2 versions
|
271 |
+
| Model size | | FLEURS | CommonVoice | NST |
|
272 |
+
|------------|---------|--------|-------------|------|
|
273 |
+
| [tiny](https://huggingface.co/KBLab/kb-whisper-tiny) | **standard** | **76.6** | **73.7** | **74.3** |
|
274 |
+
| | strict | 75.3 | 72.9 | 74.6 |
|
275 |
+
| | subtitle | 76.6 | 73.7 | 74.1 |
|
276 |
+
| [base](https://huggingface.co/KBLab/kb-whisper-base) | **standard** | **83.2** | **79.9** | **78.3** |
|
277 |
+
| | strict | 81.0 | 78.4 | 77.5 |
|
278 |
+
| | subtitle | 83.2 | 79.8 | 78.2 |
|
279 |
+
| [small](https://huggingface.co/KBLab/kb-whisper-small) | **standard** | **86.6** | **83.5** | **79.6** |
|
280 |
+
| | strict | 84.9 | 82.4 | 79.3 |
|
281 |
+
| | subtitle | 86.6 | 83.5 | 79.6 |
|
282 |
+
| [medium](https://huggingface.co/KBLab/kb-whisper-medium) | **standard** | **87.6** | **85.0** | **80.2** |
|
283 |
+
| | strict | 87.3 | 84.9 | 80.1 |
|
284 |
+
| [large-v3](https://huggingface.co/KBLab/kb-whisper-large) | **standard** | **89.8** | **87.2** | **81.1** |
|
285 |
+
| | strict | 90.0 | 87.4 | 81.2 |
|
286 |
+
|
287 |
+
|
288 |
|
289 |
### Acknowledgements
|
290 |
|