Unable to convert ONNX model to INT4/FP16
#95
opened by Avan2000
Hi community,
I tried converting the Gemma-7b model to an ONNX file with FP16 precision using the following command:
optimum-cli export onnx --dtype fp16 --device xpu --model google/gemma-7b --framework pt --task text-generation-with-past ./gemma-7b
But it gives an error like:
Later, I tried again without FP16 precision:
optimum-cli export onnx --model google/gemma-7b --framework pt --task text-generation-with-past ./gemma-7b
and this time successfully got the ONNX file in INT64 precision.
Now, when I try to convert it to FP16/INT4, it throws an error like:
Any clues on what to do about this?
Hi community, any update on this?