---
license: mit
tags:
- automatic-speech-recognition
- whisper
- onnx
- quantized
---

# whisper-large-v3-onnx-w8a16-dynamic

This repository contains a quantized ONNX version of the `openai/whisper-large-v3` model.

## Model Details

The original model can be found here: [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3)

## Quantization

This model has been quantized to **w8a16** (8-bit weights, 16-bit activations) using **dynamic quantization**. This reduces the model size and can improve inference speed, especially on CPUs.

## Usage

The model can be loaded with `optimum.onnxruntime.ORTModelForSpeechSeq2Seq`, and the matching processor with `transformers.WhisperProcessor`.

```python
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq
from transformers import WhisperProcessor

model_name = "mirekphd/whisper-large-v3-onnx-w8a16-dynamic"

# The processor bundles the feature extractor and the tokenizer
processor = WhisperProcessor.from_pretrained(model_name)
# Loads the ONNX graphs and runs them with ONNX Runtime
model = ORTModelForSpeechSeq2Seq.from_pretrained(model_name)

# ... add your inference code here ...
```
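
The inference placeholder in the usage snippet could be filled in roughly as follows. This is a minimal sketch, not a verified recipe for this export: the helper names `load_pipeline` and `transcribe` are hypothetical, the input is assumed to be a mono waveform already resampled to 16 kHz, and whether this particular quantized graph needs extra arguments (for example a specific input dtype) has not been checked here.

```python
MODEL_NAME = "mirekphd/whisper-large-v3-onnx-w8a16-dynamic"
WHISPER_SAMPLING_RATE = 16_000  # Whisper feature extractors expect 16 kHz audio


def load_pipeline(model_name: str = MODEL_NAME):
    """Load the processor and the ONNX model (hypothetical helper)."""
    # Imported lazily so that importing this module stays cheap.
    from optimum.onnxruntime import ORTModelForSpeechSeq2Seq
    from transformers import WhisperProcessor

    processor = WhisperProcessor.from_pretrained(model_name)
    model = ORTModelForSpeechSeq2Seq.from_pretrained(model_name)
    return processor, model


def transcribe(audio, processor, model, sampling_rate: int = WHISPER_SAMPLING_RATE) -> str:
    """Transcribe one mono waveform (1-D sequence of floats) to text."""
    if sampling_rate != WHISPER_SAMPLING_RATE:
        # Resampling is left to the caller in this sketch.
        raise ValueError(
            f"expected {WHISPER_SAMPLING_RATE} Hz audio, got {sampling_rate} Hz"
        )

    # Convert the raw waveform into log-mel input features.
    inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")

    # Autoregressively generate token ids from the encoder features.
    generated_ids = model.generate(inputs.input_features)

    # Decode the ids back to text, dropping special tokens such as timestamps.
    text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return text.strip()
```

With the dependencies installed, a call would look like `processor, model = load_pipeline()` followed by `transcribe(waveform, processor, model)`, where `waveform` is a 1-D float array loaded by an audio library of your choice. Passing the processor and model in as arguments keeps the expensive loading step separate, so the same objects can be reused across many files.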