---
license: mit
language:
- ru
pipeline_tag: automatic-speech-recognition
library_name: transformers
tags:
- asr
---

# GigaAMv2-CTC Hugging Face transformers

* Original repo: https://github.com/salute-developers/GigaAM

Russian ASR model.

## Model info

This is the original GigaAMv2-CTC model wrapped in a `transformers` library interface. The file `gigaam_transformers.py` contains the model, feature extractor, and tokenizer classes with the usual `transformers` methods. The Jupyter notebook `GigaAMHFTrain.ipynb` contains a training pipeline built on `transformers`.

## Usage

Usage is the same as for other `transformers` ASR models.

```python
>>> from gigaam_transformers import GigaAMCTCHF, GigaAMProcessor
>>> import torchaudio

>>> # load audio
>>> wav, sr = torchaudio.load("audio.wav")
>>> # resample if necessary
>>> wav = torchaudio.functional.resample(wav, sr, 16000)

>>> # load model and processor
>>> processor = GigaAMProcessor.from_pretrained("waveletdeboshir/gigaam-ctc")
>>> model = GigaAMCTCHF.from_pretrained("waveletdeboshir/gigaam-ctc")

>>> # extract features
>>> input_features = processor(wav[0], sampling_rate=16000, return_tensors="pt")

>>> # predict
>>> pred = model(input_features)

>>> # greedy decoding
>>> greedy_ids = pred.predictions.argmax(dim=-1)
>>> # decode token ids to text
>>> transcription = processor.batch_decode(greedy_ids)
```

## Finetune
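
The full training pipeline is in the `GigaAMHFTrain.ipynb` notebook. For orientation only, below is a minimal single-example fine-tuning sketch; the CTC loss wiring, the blank token id, and the way label token ids are obtained are assumptions and may differ from the notebook.

```python
import torch
from gigaam_transformers import GigaAMCTCHF, GigaAMProcessor

# Load processor and model exactly as in the Usage section
processor = GigaAMProcessor.from_pretrained("waveletdeboshir/gigaam-ctc")
model = GigaAMCTCHF.from_pretrained("waveletdeboshir/gigaam-ctc")
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
# blank=0 is an assumption; check the tokenizer's actual blank id
ctc_loss = torch.nn.CTCLoss(blank=0, zero_infinity=True)

def training_step(wav, target_ids):
    """One optimization step on a single (1-D waveform tensor, transcript token ids) pair.
    target_ids is a list of label ids, e.g. produced by the processor's tokenizer."""
    inputs = processor(wav, sampling_rate=16000, return_tensors="pt")
    targets = torch.tensor([target_ids], dtype=torch.long)

    pred = model(inputs)                              # same call as in Usage
    log_probs = pred.predictions.log_softmax(dim=-1)  # assumed shape (batch, time, vocab)
    log_probs = log_probs.transpose(0, 1)             # CTCLoss expects (time, batch, vocab)

    input_lengths = torch.full((log_probs.size(1),), log_probs.size(0), dtype=torch.long)
    target_lengths = torch.tensor([len(target_ids)], dtype=torch.long)

    loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

For the actual data preparation, batching, and evaluation, follow `GigaAMHFTrain.ipynb`.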