Will there be an ONNX version that is not quantized?

#4
by Maximum2000 - opened

Are you guys planning on releasing an ONNX version that is not quantized, similar to how we have a GGUF version?

It's OK, I generated my own.

Maximum2000 changed discussion status to closed

@Maximum2000 Can you please share the recipe for converting the ONNX model?

I used this doc: phi-4-multi-modal.md, but by default it quantizes to int4. If you don't want it quantized, follow this issue: Phi-4-multimodal-instruct ONNX fp32 fails to load.
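For reference, a minimal sketch of the kind of command involved, assuming the model builder from the onnxruntime-genai repo (where phi-4-multi-modal.md lives) and its documented `-m`/`-o`/`-p`/`-e` flags; the multimodal vision/speech components have their own steps in that doc, and the fp32 path may hit the loading issue linked above:

```
# Sketch, not a verified recipe: assumes the onnxruntime-genai model
# builder accepts this model ID directly. Passing -p fp32 instead of
# the default int4 skips quantization; -e cpu picks the CPU execution
# provider (cuda/dml are the usual alternatives).
python -m onnxruntime_genai.models.builder \
    -m microsoft/Phi-4-multimodal-instruct \
    -o ./phi4-mm-onnx-fp32 \
    -p fp32 \
    -e cpu
```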
