Comparison to official QAT
#2 by eramax - opened
Have you done any comparison between this one and the official QAT one?
I looked into the official QAT GGUF ones, and they are actually slightly different from this one. Mine are converted from the Kaggle Flax checkpoints: https://www.kaggle.com/models/google/gemma-3/flax. Google doesn't seem to have released any details on the differences (or why they exist). The biggest difference is that the Kaggle Flax QAT INT4 checkpoints also quantize the embeddings, while the official GGUFs do not. Other parameters also differ slightly.
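If you want to verify this yourself, here is a minimal sketch using the `gguf` Python package that ships with llama.cpp. The file names are placeholders, and I'm assuming the standard llama.cpp tensor name `token_embd.weight` for the token embedding table:

```python
# Sketch: report the quantization type of the embedding tensor in a GGUF file.
# Assumes the `gguf` package from llama.cpp (pip install gguf).
from gguf import GGUFReader

def embedding_quant_type(path: str) -> str:
    reader = GGUFReader(path)
    for tensor in reader.tensors:
        # "token_embd.weight" is llama.cpp's name for the token embedding table
        if tensor.name == "token_embd.weight":
            # tensor_type is a GGMLQuantizationType enum, e.g. F16, Q4_0, ...
            return tensor.tensor_type.name
    return "not found"

# Hypothetical file names; point these at the two GGUFs you want to compare.
for path in ("gemma-3-27b-it-qat.gguf", "gemma-3-27b-it-official-qat.gguf"):
    print(path, "->", embedding_quant_type(path))
```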
In terms of accuracy, I don't have the resources to test them, but anyone is free to. I don't think the INT4 embeddings are worth the size reduction for the 27B model (17GB -> 15GB), but they may be worth it for the 1B model (1GB -> 0.5GB).
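A back-of-the-envelope estimate shows why the tradeoff flips with model size. This sketch assumes Gemma 3's 262,144-token vocabulary and hidden sizes of 1152 (1B) and 5376 (27B); treat those config values as assumptions, and note it ignores quantization scale overhead:

```python
# Rough size of the token embedding table at F16 vs INT4.
# Vocab and hidden sizes are assumed Gemma 3 config values; illustrative only.
VOCAB = 262_144

def embed_gb(hidden: int, bits: int) -> float:
    # vocab * hidden parameters, `bits` bits each, in gigabytes
    return VOCAB * hidden * bits / 8 / 1e9

for name, hidden in (("1B", 1_152), ("27B", 5_376)):
    f16, int4 = embed_gb(hidden, 16), embed_gb(hidden, 4)
    print(f"{name}: {f16:.2f} GB at F16 -> {int4:.2f} GB at INT4 "
          f"(saves {f16 - int4:.2f} GB)")
```

The absolute saving is in the same ballpark either way (~0.45 GB for 1B, ~2 GB for 27B), but it's roughly half the 1B file versus about a tenth of the 27B one, which lines up with the sizes quoted above.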