This 4-bit ONNX version of the model is basically unusable.
#12, opened by limingde
The phi4-mm vision model, after INT4 quantization, is essentially unusable. The output has little to no correlation with the input or the expected vision model output.
+1, the vision model is unusable. No matter what the input is, the output always says:
The image appears to be a highly pixelated or corrupted image, making it difficult to provide a detailed description. The visible content includes various alphanumeric characters and symbols scattered across the image in a seemingly random pattern. Due to the low resolution and unclear content, a precise and detailed description cannot be accurately provided.