Whisper Mongolian ASR Model
This is a Whisper-style model for Mongolian automatic speech recognition (ASR), trained from scratch with a custom implementation of the Whisper architecture.
Model Details
- Architecture: Custom Whisper-like model trained from scratch
- Training Data: Mozilla Common Voice Mongolian dataset
- Performance Metrics:
  - Word Error Rate (WER): 0.9278
  - Character Error Rate (CER): 0.7262
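For readers unfamiliar with these metrics, the sketch below shows the standard definitions: WER is the word-level edit distance between the reference and the hypothesis divided by the number of reference words, and CER is the same ratio at the character level. This card does not say which tool or text normalization produced the figures above, so the function names (`edit_distance`, `wer`, `cer`) and the choice to count spaces as characters are assumptions made for illustration only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate; spaces count as characters (an assumption)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    reference = "сайн байна уу"
    hypothesis = "сайн байна"
    print(f"WER: {wer(reference, hypothesis):.3f}")
    print(f"CER: {cer(reference, hypothesis):.3f}")
```

Because insertions are counted, both metrics can exceed 1.0 when the hypothesis is much longer than the reference.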
Usage
This model can be used in two ways:
1. Using the compatibility wrapper:
from transformers import pipeline
import torch

# Use a GPU when one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
transcriber = pipeline(
    "automatic-speech-recognition",
    model="Nasanbuyan/whisper-mongolian",
    device=device,
)

# Transcribe audio
result = transcriber("path/to/audio.mp3")
print(result["text"])
2. Using the original implementation:
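import importlib

# "whisper-mongolian" contains a hyphen, so a plain `from ... import` statement
# is a syntax error. Load the module by name instead; this assumes the
# "whisper-mongolian" directory is on the Python path.
whisper_model = importlib.import_module("whisper-mongolian.whisper_model")
WhisperModel = whisper_model.WhisperModel

# Load the model
model = WhisperModel("Nasanbuyan/whisper-mongolian", device="cpu")

# Transcribe audio
segments, info = model.transcribe("path/to/audio.mp3")
transcription = " ".join(segment.text for segment in segments)
print(transcription)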
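When the joined transcript is not enough, the segments can also be printed one per line. The sketch below is only illustrative: the example above confirms that each segment has a `text` attribute, but the `start` and `end` attributes (in seconds), the `Segment` stand-in class, and the helper names are hypothetical and should be checked against the actual `WhisperModel` implementation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for the objects returned by transcribe()."""
    start: float  # assumed: segment start time in seconds
    end: float    # assumed: segment end time in seconds
    text: str     # confirmed by the usage example above


def format_timestamp(seconds):
    """Render a time in seconds as MM:SS.ss."""
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes):02d}:{secs:05.2f}"


def format_segments(segments):
    """Return one line per segment: '[start -> end] text'."""
    return "\n".join(
        f"[{format_timestamp(s.start)} -> {format_timestamp(s.end)}] {s.text.strip()}"
        for s in segments
    )


if __name__ == "__main__":
    demo = [
        Segment(0.0, 2.5, " сайн байна уу"),
        Segment(2.5, 75.5, " баярлалаа"),
    ]
    print(format_segments(demo))
```

If the real segment objects expose different attribute names, only `format_segments` needs to change.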
Citation
If you use this model, please cite:
@misc{whisper-mongolian,
  author = {Your Name},
  title = {Whisper Mongolian ASR Model},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Nasanbuyan/whisper-mongolian}}
}