Commit
·
863a64d
1
Parent(s):
cb7ac4f
Update README.md
README.md
CHANGED
@@ -1,5 +1,35 @@
|
|
1 |
-
|
2 |
-
3 |
|
4 |
## Usage
|
5 |
The model can be used directly (without a language model) as follows, assuming you have a dataset with Marathi text and audio_path fields:
|
@@ -51,7 +81,7 @@ processor = Wav2Vec2Processor.from_pretrained("gchhablani/wav2vec2-large-xlsr-mr
|
|
51 |
model = Wav2Vec2ForCTC.from_pretrained("gchhablani/wav2vec2-large-xlsr-mr-3")
|
52 |
model.to("cuda")
|
53 |
|
54 |
-
chars_to_ignore_regex = '[
|
55 |
|
56 |
|
57 |
# Preprocessing the datasets.
|
|
|
1 |
+
---
|
2 |
+
language: mr
|
3 |
+
datasets:
|
4 |
+
- openslr
|
5 |
+
- interspeech_2021_asr
|
6 |
+
metrics:
|
7 |
+
- wer
|
8 |
+
tags:
|
9 |
+
- audio
|
10 |
+
- automatic-speech-recognition
|
11 |
+
- speech
|
12 |
+
- xlsr-fine-tuning-week
|
13 |
+
- hindi
|
14 |
+
- marathi
|
15 |
+
license: apache-2.0
|
16 |
+
model-index:
|
17 |
+
- name: XLSR Wav2Vec2 Large 53 Hindi-Marathi by Tanmay Laud
|
18 |
+
results:
|
19 |
+
- task:
|
20 |
+
name: Speech Recognition
|
21 |
+
type: automatic-speech-recognition
|
22 |
+
dataset:
|
23 |
+
name: OpenSLR hi, OpenSLR mr
|
24 |
+
type: openslr, interspeech_2021_asr
|
25 |
+
metrics:
|
26 |
+
- name: Test WER
|
27 |
+
type: wer
|
28 |
+
value: 60.80
|
29 |
+
---
|
30 |
+
|
31 |
+
# Wav2Vec2-Large-XLSR-53-Hindi-Marathi
|
32 |
+
### Fine-tuned facebook/wav2vec2-large-xlsr-53 on Hindi and Marathi using the OpenSLR SLR64 datasets. Note that this OpenSLR data contains only female voices. Please keep this in mind before using the model for your task. When using this model, make sure that your speech input is sampled at 16 kHz.
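The metadata above reports a test WER of 60.80 (presumably a percentage). As a reminder of what that metric measures, here is a minimal sketch of word error rate: the word-level edit distance between a reference and a hypothesis, divided by the number of reference words. The function name `word_error_rate` is illustrative and is not taken from this model's evaluation script.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Multiplying the result by 100 gives the percentage form usually quoted on model cards.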
|
33 |
|
34 |
## Usage
|
35 |
The model can be used directly (without a language model) as follows, assuming you have a dataset with Marathi text and audio_path fields:
|
|
|
81 |
model = Wav2Vec2ForCTC.from_pretrained("gchhablani/wav2vec2-large-xlsr-mr-3")
|
82 |
model.to("cuda")
|
83 |
|
84 |
+
chars_to_ignore_regex = '[\\,\\?\\.\\!\\-\\;\\:\\"\\“\\%\\‘\\”\\�\\–\\…]'
|
85 |
|
86 |
|
87 |
# Preprocessing the datasets.
|
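The `chars_to_ignore_regex` completed in this commit is a character class of punctuation to strip from transcripts before scoring. A minimal sketch of how such a pattern is typically applied follows; the helper name `clean_sentence` is hypothetical and the README's actual preprocessing function is not shown in this diff.

```python
import re

# Pattern copied from the line added in this commit.
chars_to_ignore_regex = '[\\,\\?\\.\\!\\-\\;\\:\\"\\“\\%\\‘\\”\\�\\–\\…]'


def clean_sentence(text: str) -> str:
    """Remove every character matched by chars_to_ignore_regex (hypothetical helper)."""
    return re.sub(chars_to_ignore_regex, "", text)
```

Devanagari characters are not in the class, so Hindi and Marathi words pass through unchanged while commas, question marks, and similar symbols are dropped.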