Add/update the quantized ONNX model files and README.md for Transformers.js v3

by whitphx HF Staff - opened about 17 hours ago

base: refs/heads/main ← from: refs/pr/3

Files changed: +24 -3

whitphx

about 17 hours ago

Applied Quantizations

✅ Based on `decoder_model_merged.onnx` with slimming

The base model `decoder_model_merged.onnx` has been renamed to `model.onnx`.

↳ ✅ `fp16`: `model_fp16.onnx` (added)
↳ ✅ `int8`: `model_int8.onnx` (added)
↳ ✅ `uint8`: `model_uint8.onnx` (added)
↳ ✅ `q4`: `model_q4.onnx` (added)
↳ ✅ `q4f16`: `model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `model_bnb4.onnx` (added)
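
The list above amounts to a lookup from quantization label to file name. A minimal sketch of that lookup follows; the names `QUANTIZED_FILES` and `onnxFileFor` are hypothetical helpers for illustration and are not part of Transformers.js. Only the labels and file names are taken from this PR.

```typescript
// Quantization labels and file names exactly as listed in this PR.
const QUANTIZED_FILES = {
  fp16: "model_fp16.onnx",
  int8: "model_int8.onnx",
  uint8: "model_uint8.onnx",
  q4: "model_q4.onnx",
  q4f16: "model_q4f16.onnx",
  bnb4: "model_bnb4.onnx",
} as const;

type Quantization = keyof typeof QUANTIZED_FILES;

// Hypothetical helper: with no label, fall back to the renamed base file.
function onnxFileFor(quantization?: Quantization): string {
  return quantization === undefined
    ? "model.onnx"
    : QUANTIZED_FILES[quantization];
}

console.log(onnxFileFor("q4"));
console.log(onnxFileFor());
```

Typing the labels as a union means a label that is not in the list is rejected at compile time rather than producing a missing file at load time.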

Add/update the quantized ONNX model files and README.md for Transformers.js v3 (7df8dd8d)

Xenova changed pull request status to merged about 2 hours ago

