Add/update the quantized ONNX model files and README.md for Transformers.js v3

#3
by whitphx HF Staff - opened

Applied Quantizations

✅ Based on decoder_model_merged.onnx with slimming

The base model decoder_model_merged.onnx has been renamed to model.onnx.

↳ ✅ fp16: model_fp16.onnx (added)
↳ ✅ int8: model_int8.onnx (added)
↳ ✅ uint8: model_uint8.onnx (added)
↳ ✅ q4: model_q4.onnx (added)
↳ ✅ q4f16: model_q4f16.onnx (added)
↳ ✅ bnb4: model_bnb4.onnx (added)
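
The file names above follow a simple pattern: the base model is model.onnx and each quantized variant is model_<dtype>.onnx. As a rough sketch of that naming scheme only, the helper below maps a dtype label to the file added in this PR. The function name `onnxFileForDtype` and the "fp32" label for the unquantized base model are assumptions made for illustration; they are not taken from this PR.

```typescript
// Dtype labels for the variants listed in this PR.
// "fp32" is an assumed label for the unquantized base model (model.onnx).
type Dtype = "fp32" | "fp16" | "int8" | "uint8" | "q4" | "q4f16" | "bnb4";

// Hypothetical helper: returns the ONNX file name for a given dtype,
// following the naming pattern visible in the list above.
function onnxFileForDtype(dtype: Dtype): string {
  return dtype === "fp32" ? "model.onnx" : `model_${dtype}.onnx`;
}

// Print the file name for every variant.
const allDtypes: Dtype[] = ["fp32", "fp16", "int8", "uint8", "q4", "q4f16", "bnb4"];
for (const dtype of allDtypes) {
  console.log(`${dtype}: ${onnxFileForDtype(dtype)}`);
}
```

In Transformers.js v3, the `dtype` option passed when loading a model is what selects among variants like these, so a caller would normally pass a label such as "q4" instead of building the file name by hand.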

Xenova changed pull request status to merged
