---
license: mit
tags:
- automatic-speech-recognition
- whisper
- onnx
- quantized
---

# ONNX version of whisper-large-v3 (w8a16, dynamic quantization)

This repository contains a quantized ONNX version of the `openai/whisper-large-v3` model.

## Model Details

The original model can be found here: [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3)


## Quantization

This model has been quantized to **w8a16** (8-bit weights, 16-bit activations) using **dynamic quantization**.
Compared with the unquantized model, this reduces the model size and can improve inference speed, especially on CPUs.
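
To give a feel for why 8-bit weights shrink the model, here is a minimal sketch of symmetric per-tensor int8 weight quantization in plain Python. It illustrates the general idea only: the exact scheme used to produce this repository's files is not documented here, and the helper names `quantize_int8` and `dequantize` are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using a single shared scale.

    Assumption: symmetric per-tensor quantization, chosen for illustration;
    the scheme actually used for this model may differ.
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would work.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [v * scale for v in quantized]


weights = [0.5, -1.27, 0.0, 0.8]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each restored value differs from the original by at most half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored as one small integer plus a scale shared across the tensor, which is where the size reduction comes from. The price is a rounding error of at most half a quantization step per weight.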

## Usage

Load the model with `optimum.onnxruntime.ORTModelForSpeechSeq2Seq` and the matching processor with `transformers.WhisperProcessor`.

```python
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq
from transformers import WhisperProcessor

model_name = "mirekphd/whisper-large-v3-onnx-w8a16-dynamic"
processor = WhisperProcessor.from_pretrained(model_name)
model = ORTModelForSpeechSeq2Seq.from_pretrained(model_name)

# ... add your inference code here ...
```
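
The inference step left open in the snippet above could look roughly like the sketch below. It assumes `audio` is a mono waveform already resampled to 16 kHz (the rate Whisper expects) and that `processor` and `model` were loaded as shown. The helper name `transcribe` is hypothetical and not part of either library.

```python
def transcribe(audio, processor, model, sampling_rate=16000):
    """Transcribe one mono waveform and return the decoded text.

    A minimal sketch: `processor` is a WhisperProcessor and `model` an
    ORTModelForSpeechSeq2Seq, both loaded by the caller.
    """
    # Convert the raw waveform into log-mel input features.
    inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")
    # Autoregressively generate token ids from the audio features.
    predicted_ids = model.generate(inputs.input_features)
    # Decode ids to text, dropping special tokens such as language tags.
    text = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
    return text.strip()
```

With the objects from the loading snippet, a call would be `transcribe(audio, processor, model)`. By default the Whisper feature extractor pads or truncates its input to a 30-second window, so longer recordings would likely need to be split into chunks first.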