metadata
license: mit
tags:
- automatic-speech-recognition
- whisper
- onnx
- quantized
ONNX version of whisper-large-v3 (w8a16, dynamic quantization)
This repository contains the ONNX version of the openai/whisper-large-v3 model.
Model Details
The original model can be found here: openai/whisper-large-v3
Quantization
This model has been quantized to w8a16 (8-bit weights, 16-bit activations) using dynamic quantization. Storing the weights in 8 bits reduces the model size and can improve inference speed, especially on CPUs.
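To give a rough sense of the size reduction, the sketch below estimates weight storage at different precisions. The parameter count (about 1.55 billion) is an assumed approximate figure for whisper-large-v3, and `weight_size_gb` is a hypothetical helper, not part of any library. Actual file sizes in this repository will differ, since not every tensor is necessarily quantized and quantization scales add some overhead.

```python
# Back-of-the-envelope estimate of weight storage (assumed figures, not measurements).
PARAMS = 1_550_000_000  # approximate parameter count, an assumption


def weight_size_gb(num_params: int, bits_per_weight: int) -> float:
    """Return storage for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


for label, bits in [("fp32", 32), ("fp16", 16), ("int8 (w8)", 8)]:
    print(f"{label:>10}: {weight_size_gb(PARAMS, bits):.2f} GB")
```

Under these assumptions, 8-bit weights need about a quarter of the storage of 32-bit weights.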
Usage
The model can be used with optimum.onnxruntime.ORTModelForSpeechSeq2Seq.
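from optimum.onnxruntime import ORTModelForSpeechSeq2Seq
from transformers import WhisperProcessor

# Hugging Face Hub repository that hosts the quantized ONNX model
model_name = "mirekphd/whisper-large-v3-onnx-w8a16-dynamic"

# Load the processor (feature extractor and tokenizer) and the ONNX Runtime model
processor = WhisperProcessor.from_pretrained(model_name)
model = ORTModelForSpeechSeq2Seq.from_pretrained(model_name)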
# ... add your inference code here ...
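One possible way to fill in the inference step is sketched below. It is a minimal example, not the author's reference code: `transcribe` is a hypothetical helper, and it assumes the input is a single mono waveform sampled at 16 kHz (a list of floats or a 1-D NumPy array) and that the processor files are available in the repository, as the snippet above implies.

```python
from typing import Sequence

MODEL_NAME = "mirekphd/whisper-large-v3-onnx-w8a16-dynamic"


def transcribe(
    audio: Sequence[float],
    sampling_rate: int = 16_000,
    model_name: str = MODEL_NAME,
) -> str:
    """Transcribe one mono waveform and return the decoded text."""
    # Imported lazily so that defining this helper does not load the heavy dependencies.
    from optimum.onnxruntime import ORTModelForSpeechSeq2Seq
    from transformers import WhisperProcessor

    processor = WhisperProcessor.from_pretrained(model_name)
    model = ORTModelForSpeechSeq2Seq.from_pretrained(model_name)

    # Convert the raw waveform into log-mel input features.
    inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")

    # Generate token ids autoregressively, then decode them to text.
    predicted_ids = model.generate(inputs.input_features)
    return processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
```

Calling `transcribe(waveform)` downloads the model on first use. In a real application, load the processor and model once and reuse them across calls instead of reloading them inside the function.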