Khmer Whisper ASR: songhieng/whisper-khmer-v1
This is a fine-tuned version of openai/whisper-small on Khmer-language audio data. It is designed for automatic speech recognition (ASR) tasks in Khmer (ខ្មែរ).
Training loss at step 5200: 0.095900
Model Details
- Base model: openai/whisper-tiny
- Language: Khmer (km)
- Task: Transcription (task="transcribe")
- Fine-tuned on: Male voice Khmer audio dataset from tts-data-kh
- Owner: songhieng
How to Use
from transformers import WhisperProcessor, WhisperForConditionalGeneration
import torchaudio
import torch

# Load processor and model
processor = WhisperProcessor.from_pretrained("songhieng/whisper-khmer-v1")
model = WhisperForConditionalGeneration.from_pretrained("songhieng/whisper-khmer-v1")
model.eval()

# Clear forced decoder IDs to avoid the forced decoder ID error
model.config.forced_decoder_ids = None
model.generation_config.forced_decoder_ids = None

# Load audio, downmix to mono, and resample to 16 kHz
speech_array, sampling_rate = torchaudio.load("your_audio.wav")
speech_array = speech_array.mean(dim=0)  # (channels, samples) -> (samples,)
if sampling_rate != 16000:
    resampler = torchaudio.transforms.Resample(sampling_rate, 16000)
    speech_array = resampler(speech_array)

# Extract features (pass the sampling rate so the processor can validate it)
input_features = processor(
    speech_array.numpy(), sampling_rate=16000, return_tensors="pt"
).input_features

# Transcribe
with torch.no_grad():
    predicted_ids = model.generate(input_features)
text = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print("Transcription:", text)
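Whisper models work on a fixed 30-second input window, so the snippet above transcribes only the first 30 seconds of a longer file. One possible way to handle longer recordings is to split the waveform into consecutive windows and transcribe each slice in turn. The helper below is a minimal sketch of that idea, not part of this model's published usage. The names `chunk_bounds`, `SAMPLE_RATE`, and `WINDOW_SECONDS` are hypothetical and introduced only for illustration.

```python
from typing import Iterator, Tuple

SAMPLE_RATE = 16_000   # Whisper expects 16 kHz audio
WINDOW_SECONDS = 30    # Whisper's fixed input window length


def chunk_bounds(
    num_samples: int,
    sample_rate: int = SAMPLE_RATE,
    window_seconds: int = WINDOW_SECONDS,
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) sample indices that cover a clip in consecutive windows."""
    window = sample_rate * window_seconds
    for start in range(0, num_samples, window):
        # The last window is shorter when the clip length is not a multiple of `window`.
        yield start, min(start + window, num_samples)


# A 70-second clip at 16 kHz splits into 30 s + 30 s + 10 s.
print(list(chunk_bounds(70 * SAMPLE_RATE)))
# [(0, 480000), (480000, 960000), (960000, 1120000)]
```

Each `(start, end)` pair can be used to slice the resampled mono waveform, for example `speech_array[start:end]`, before passing the slice through the processor and `model.generate`; the partial transcriptions are then joined. Cutting at fixed boundaries may split a word in half, so overlapping windows or silence-based splitting would likely give cleaner results.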