Add/update the quantized ONNX model files and README.md for Transformers.js v3
#3 opened by whitphx (HF Staff)
Applied Quantizations
✅ Based on decoder_model_merged.onnx with slimming
The base model decoder_model_merged.onnx has been renamed to model.onnx.
↳ ✅ fp16: model_fp16.onnx (added)
↳ ✅ int8: model_int8.onnx (added)
↳ ✅ uint8: model_uint8.onnx (added)
↳ ✅ q4: model_q4.onnx (added)
↳ ✅ q4f16: model_q4f16.onnx (added)
↳ ✅ bnb4: model_bnb4.onnx (added)
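The naming scheme above can be summarized as a lookup from quantization name to file name. The following is a minimal sketch, not code from this pull request or from Transformers.js: the `Dtype` type, the `MODEL_FILES` table, and the `modelFileFor` helper are hypothetical names introduced for illustration, and treating the renamed base model `model.onnx` as the `fp32` variant is an assumption.

```typescript
// Hypothetical names for illustration only; not part of Transformers.js.
type Dtype = "fp32" | "fp16" | "int8" | "uint8" | "q4" | "q4f16" | "bnb4";

// File names as listed in this pull request.
// Assumption: the renamed base model (model.onnx) is the fp32 variant.
const MODEL_FILES: Record<Dtype, string> = {
  fp32: "model.onnx",
  fp16: "model_fp16.onnx",
  int8: "model_int8.onnx",
  uint8: "model_uint8.onnx",
  q4: "model_q4.onnx",
  q4f16: "model_q4f16.onnx",
  bnb4: "model_bnb4.onnx",
};

// Return the ONNX file name expected for a given quantization.
function modelFileFor(dtype: Dtype): string {
  return MODEL_FILES[dtype];
}

console.log(modelFileFor("q4f16"));
```

In Transformers.js v3 a variant is normally chosen through the `dtype` loading option rather than by file name, so a table like this is mainly useful for checking that a repository contains the files a given `dtype` would need.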
Xenova changed pull request status to merged