Update README.md
We evaluated the speculative decoding setup for Whisper-large-v3-singlish on the following datasets:

- [AMI](https://huggingface.co/datasets/edinburghcstr/ami): A widely used dataset for meeting transcription and diarization tasks. This work specifically uses the IHM (Individual Headset Microphone) recordings.
- [GigaSpeech](https://huggingface.co/datasets/speechcolab/gigaspeech): A large-scale open-source dataset with diverse English audio, covering read, conversational, and spontaneous speech.

### Model Performance

| **Dataset** | **Model Variant** | **Link** | **Rel. RTFx** | **WER** |
|--------------|----------------------|----------|---------------|---------|
| AMI | Large | [Whisper-large-v3-singlish](https://huggingface.co/mjwong/whisper-large-v3-singlish) | 1.00 | 23.72% |
| AMI | Large-Turbo | [Whisper-large-v3-turbo-singlish](https://huggingface.co/mjwong/whisper-large-v3-turbo-singlish) | 1.53 | **16.99%** |
| AMI | Draft-enhanced Large | Whisper-large-v3-singlish + [DRAFT](https://huggingface.co/mjwong/whisper-large-v3-singlish-DRAFT) | **2.27** | 22.06% |
||||||
| GigaSpeech | Large | [Whisper-large-v3-singlish](https://huggingface.co/mjwong/whisper-large-v3-singlish) | 1.00 | 13.15% |
| GigaSpeech | Large-Turbo | [Whisper-large-v3-turbo-singlish](https://huggingface.co/mjwong/whisper-large-v3-turbo-singlish) | 1.95 | **11.54%** |
| GigaSpeech | Draft-enhanced Large | Whisper-large-v3-singlish + [DRAFT](https://huggingface.co/mjwong/whisper-large-v3-singlish-DRAFT) | **2.37** | 12.81% |
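
Rel. RTFx in the table is throughput relative to the Large baseline (baseline = 1.00). As a hypothetical sketch (the wall-clock timings below are illustrative, not from this evaluation), the ratio maps to speedup like this:

```python
def relative_rtfx(baseline_seconds: float, variant_seconds: float) -> float:
    """Relative real-time factor: how many times faster a variant
    transcribes the same audio than the baseline (baseline = 1.00)."""
    return baseline_seconds / variant_seconds

# Illustrative (made-up) wall-clock timings for the same audio batch:
large_s = 227.0           # Large baseline
draft_enhanced_s = 100.0  # Draft-enhanced Large

print(round(relative_rtfx(large_s, draft_enhanced_s), 2))  # 2.27
```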

### Speculative Acceptance Rates (DRAFT-enhanced Large Model)

| **Dataset** | **Micro Avg Acceptance** | **Macro Avg Acceptance** |
|----------------|--------------------------|---------------------------|
| SASRBench-v1 | 38.00% | 42.00% |
| AMI | 38.00% | 43.00% |
| GigaSpeech | 31.00% | 37.00% |
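
Micro and macro averages differ in how per-utterance acceptance is pooled, which is why the two columns diverge. A small hypothetical example (the counts are made up, not benchmark data):

```python
# Hypothetical per-utterance draft-token statistics (not benchmark data).
utterances = [
    {"accepted": 30, "proposed": 100},  # long utterance, low acceptance
    {"accepted": 8,  "proposed": 10},   # short utterance, high acceptance
]

# Micro average: pool accepted/proposed tokens across all utterances,
# so long utterances dominate.
micro = sum(u["accepted"] for u in utterances) / sum(u["proposed"] for u in utterances)

# Macro average: mean of per-utterance acceptance rates, so each
# utterance weighs equally regardless of length.
macro = sum(u["accepted"] / u["proposed"] for u in utterances) / len(utterances)

print(f"micro={micro:.2%}, macro={macro:.2%}")  # micro=34.55%, macro=55.00%
```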

### Conclusion

While it does not outperform Large-Turbo in WER, the Draft-enhanced Large model demonstrates strong speculative acceptance rates (~31–43%), indicating meaningful potential for runtime gains through early prediction acceptance. In latency-sensitive applications, it offers a compelling middle ground between the high accuracy of Large-Turbo and the slower inference of standard decoding.

## Disclaimer