This 4-bit ONNX version of the model is basically unusable.

#12
by limingde - opened

The phi4-mm vision model, after INT4 quantization, is essentially unusable. The output has little to no correlation with the input or the expected vision model output.
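One way to put a number on "little to no correlation" is to compare a full-precision output against its INT4 counterpart. The sketch below is a toy illustration only, not the pipeline used to produce this model: it assumes symmetric block-wise INT4 quantization with a block size of 32, uses random weights in place of real model weights, and the helper names `quantize_int4_symmetric` and `cosine_similarity` are hypothetical.

```python
import numpy as np

def quantize_int4_symmetric(w, block_size=32):
    """Round-trip weights through symmetric block-wise INT4 (integer range [-8, 7]).

    Assumption: one scale per block of `block_size` values; real exporters may differ.
    """
    flat = w.reshape(-1, block_size)
    # One scale per block, chosen so the largest magnitude maps to 7.
    scale = np.abs(flat).max(axis=1, keepdims=True) / 7.0
    scale[scale == 0] = 1.0  # avoid division by zero for all-zero blocks
    q = np.clip(np.round(flat / scale), -8, 7)
    # Dequantize back to floating point so the result can be compared directly.
    return (q * scale).reshape(w.shape)

def cosine_similarity(a, b):
    """Cosine similarity between two arrays, flattened to vectors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)  # stand-in for a weight matrix
x = rng.standard_normal((8, 256)).astype(np.float32)    # stand-in for input activations

w_q = quantize_int4_symmetric(w)
reference = x @ w    # full-precision output
quantized = x @ w_q  # output using INT4 round-tripped weights

similarity = cosine_similarity(reference, quantized)
max_abs_err = float(np.abs(w - w_q).max())
print(f"cosine similarity: {similarity:.4f}, max weight error: {max_abs_err:.4f}")
```

Running the same comparison on matching intermediate outputs of the original and quantized vision models, if both are available, would show where the two start to diverge.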

+1, the vision model is unusable. The output always says:

The image appears to be a highly pixelated or corrupted image, making it difficult to provide a detailed description. The visible content includes various alphanumeric characters and symbols scattered across the image in a seemingly random pattern. Due to the low resolution and unclear content, a precise and detailed description cannot be accurately provided.

This happens no matter what the input is.
