Commit 412ecdc by waveletdeboshir (verified, parent 371fd9d): Update README.md
---
license: mit
language:
- ru
pipeline_tag: automatic-speech-recognition
library_name: transformers
tags:
- asr
---

# GigaAMv2-CTC Hugging Face transformers

* Original repo: https://github.com/salute-developers/GigaAM

Russian ASR model

## Model info
This is the original GigaAMv2-CTC model with a `transformers` library interface.

The file `gigaam_transformers.py` contains the model, feature extractor, and tokenizer classes with the usual `transformers` methods.

The Jupyter notebook `GigaAMHFTrain.ipynb` contains a training pipeline built with `transformers`.

## Usage
Usage is the same as for any other `transformers` ASR model.

```python
>>> from gigaam_transformers import GigaAMCTCHF, GigaAMProcessor
>>> import torchaudio

>>> # load the audio file
>>> wav, sr = torchaudio.load("audio.wav")
>>> # resample to 16 kHz if necessary
>>> wav = torchaudio.functional.resample(wav, sr, 16000)

>>> # load the model and processor
>>> processor = GigaAMProcessor.from_pretrained("waveletdeboshir/gigaam-ctc")
>>> model = GigaAMCTCHF.from_pretrained("waveletdeboshir/gigaam-ctc")

>>> input_features = processor(wav[0], sampling_rate=16000, return_tensors="pt")

>>> # forward pass
>>> pred = model(input_features)
>>> # greedy decoding: take the most probable token at each frame
>>> greedy_ids = pred.predictions.argmax(dim=-1)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(greedy_ids)
```

## Finetune