nectec/Pathumma-whisper-th-large-v3

Automatic Speech Recognition

PATTARA TIPAKSORN committed on Oct 24, 2024

Commit d90bc77 · verified · 1 Parent(s): 66d2133

Update README.md

Files changed (1)

README.md +22 -1

@@ -18,7 +18,28 @@ More information needed
 More information needed
 ## Quickstart
-More information needed
 ## Evaluation Performance
 Note: WER calculated with newmm tokenizer for Thai segmentation.

 More information needed
 ## Quickstart
+You can transcribe audio files using the [`pipeline`](https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.AutomaticSpeechRecognitionPipeline) class with the following code snippet:
+```python
+import torch
+from transformers import pipeline
+device = "cuda" if torch.cuda.is_available() else "cpu"
+torch_dtype = torch.bfloat16 if torch.cuda.is_available() else torch.float32
+lang = "th"
+task = "transcribe"
+pipe = pipeline(
+    task="automatic-speech-recognition",
+    model="nectec/Pathumma-whisper-th-large-v3",
+    torch_dtype=torch_dtype,
+    device=device,
+)
+pipe.model.config.forced_decoder_ids = pipe.tokenizer.get_decoder_prompt_ids(language=lang, task=task)
+text = pipe("audio_path.wav")["text"]
+print(text)
+```
 ## Evaluation Performance
 Note: WER calculated with newmm tokenizer for Thai segmentation.
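
As a variation on the Quickstart snippet added in this commit, the sketch below passes the language and task on each call through `generate_kwargs` instead of assigning `forced_decoder_ids` on the model config. This is an assumption about how one might adapt the snippet for recent `transformers` releases, not part of the commit itself; the helper names `build_generate_kwargs` and `transcribe` are hypothetical, and `audio_path.wav` is the same placeholder path used in the README.

```python
def build_generate_kwargs(lang: str = "th", task: str = "transcribe") -> dict:
    """Return generation options that the pipeline forwards to Whisper's generate()."""
    return {"language": lang, "task": task}


def transcribe(audio_path: str, lang: str = "th", task: str = "transcribe") -> str:
    """Transcribe one audio file with the model named in the README (hypothetical helper)."""
    # Heavy dependencies are imported lazily so build_generate_kwargs stays usable without them.
    import torch
    from transformers import pipeline

    use_cuda = torch.cuda.is_available()
    pipe = pipeline(
        task="automatic-speech-recognition",
        model="nectec/Pathumma-whisper-th-large-v3",
        torch_dtype=torch.bfloat16 if use_cuda else torch.float32,
        device="cuda" if use_cuda else "cpu",
    )
    # Language and task are supplied per call rather than written into the model config.
    result = pipe(audio_path, generate_kwargs=build_generate_kwargs(lang, task))
    return result["text"]


# Example call (downloads the model on first use):
# print(transcribe("audio_path.wav"))
```

Keeping the language and task out of the model config means the same pipeline object could be reused for other languages or for translation without being mutated between calls.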