WhisperConformer
WhisperConformer is an ASR model that combines the Whisper and Conformer architectures, pairing global context modeling with local feature extraction to improve recognition accuracy. The model is trained on 387 hours of Thai speech data.
This model can be fine-tuned by following the Fine-Tune Whisper with 🤗 Transformers guide.
Usage
pip install --upgrade pip
pip install WhisperConformer
The model can be used with the pipeline class to transcribe audio of arbitrary length:
from transformers import pipeline, WhisperTokenizer, WhisperFeatureExtractor
from WhisperConformer import WhisperConformerModel

model_name = "Thanakron/whisperConformer-medium-th"

# Load the preprocessing components and the model weights from the Hub
feature_extractor = WhisperFeatureExtractor.from_pretrained(model_name)
tokenizer = WhisperTokenizer.from_pretrained(model_name)
model = WhisperConformerModel.from_pretrained(model_name)

pipe = pipeline(
    task="automatic-speech-recognition",
    model=model,
    tokenizer=tokenizer,
    feature_extractor=feature_extractor,
)

def transcribe(audio):
    # The pipeline returns a dict; the transcription is under the "text" key
    text = pipe(audio)["text"]
    return text

print(transcribe("audio.wav"))
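For long recordings, the standard Transformers ASR pipeline also supports chunked inference with timestamps (`chunk_length_s`, `return_timestamps`); these are generic pipeline parameters, not specific to WhisperConformer, so treat this as a sketch. The small helper below formats the pipeline's chunked output into readable lines:

```python
# Sketch: formatting chunked pipeline output.
# Assumes the standard transformers ASR pipeline output shape:
# {"text": ..., "chunks": [{"timestamp": (start, end), "text": ...}, ...]}

def format_chunks(result):
    """Render pipeline output chunks as '[start-end] text' lines."""
    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]
        lines.append(f"[{start:.1f}-{end:.1f}] {chunk['text'].strip()}")
    return "\n".join(lines)

# Usage (requires the model and an audio file; not run here):
# result = pipe("long_audio.wav", chunk_length_s=30, return_timestamps=True)
# print(format_chunks(result))
```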
Model tree for Thanakron/whisperConformer-medium-th
Base model: openai/whisper-medium