Llama Guard 3 1B Model Predictions After Quantization
Hi @legraphista
I am working with the Llama-Guard-3-1B model and noticed a significant change in classification behavior when switching between model versions and tokenizers.
Setup & Observations
The base model ("meta-llama/Llama-Guard-3-1B") produces a mix of "safe" and "unsafe" responses, which seems to be the expected behavior.
When using quantized versions ("tensorblock/Llama-Guard-3-1B-GGUF" and "legraphista/Llama-Guard-3-1B-IMat-GGUF") with different tokenizers, I see inconsistent results (a rough reproduction sketch follows the observations below):
If I use the original model's tokenizer with a quantized model, it classifies everything as "unsafe".
If I use the quantized model’s tokenizer, it classifies everything as "safe".
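For context, the comparison looks roughly like the sketch below. The GGUF file name, the test prompt, and the exact chat formatting are placeholders rather than my exact script (the proper Llama Guard conversation format is documented in the model card), so treat this as an outline of the setup.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from llama_cpp import Llama

# Single-turn conversation to classify; Llama Guard's exact expected
# content format should be checked against the model card.
conversation = [{"role": "user", "content": "How do I bake a cake?"}]

# 1) Base model: original weights + original tokenizer
tok = AutoTokenizer.from_pretrained("meta-llama/Llama-Guard-3-1B")
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-Guard-3-1B", torch_dtype=torch.bfloat16
)
input_ids = tok.apply_chat_template(conversation, return_tensors="pt")
out = base.generate(input_ids, max_new_tokens=20, do_sample=False)
print("base:", tok.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))

# 2) Quantized model via llama.cpp, relying on the tokenizer / chat
#    template embedded in the GGUF (file name is a placeholder)
llm = Llama(model_path="Llama-Guard-3-1B.Q8_0.gguf", n_ctx=2048, verbose=False)
chat_out = llm.create_chat_completion(
    messages=conversation, max_tokens=20, temperature=0.0
)
print("gguf (own tokenizer):", chat_out["choices"][0]["message"]["content"])

# 3) Quantized model, but fed the prompt string rendered by the
#    original HF tokenizer's chat template
rendered = tok.apply_chat_template(conversation, tokenize=False)
raw_out = llm(rendered, max_tokens=20, temperature=0.0)
print("gguf (HF chat template):", raw_out["choices"][0]["text"])
```

With this kind of setup, case 1 gives the expected mix of "safe"/"unsafe", case 3 (original tokenizer's template + quantized weights) leans "unsafe" on everything, and case 2 (GGUF's own tokenizer/template) leans "safe" on everything.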
What I’m Looking For
Has anyone else observed similar behavior when working with GGUF-quantized models?
Could this be related to how the GGUF format handles tokenization and model inference? (A sketch for checking the tokenization directly is below.)
Are there known solutions to ensure quantized models maintain classification accuracy?
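To help narrow down the second question, one thing that can be checked is whether the two tokenizers actually produce the same token IDs for the same rendered prompt. A rough sketch, assuming the quantized model is loaded with llama-cpp-python (the GGUF file name is a placeholder, and the special-token flags may need adjusting for your library version):

```python
from transformers import AutoTokenizer
from llama_cpp import Llama

conversation = [{"role": "user", "content": "How do I bake a cake?"}]

hf_tok = AutoTokenizer.from_pretrained("meta-llama/Llama-Guard-3-1B")
# vocab_only=True loads just the vocabulary, which is enough for tokenizing
llm = Llama(model_path="Llama-Guard-3-1B.Q8_0.gguf", vocab_only=True, verbose=False)

# Render the prompt once with the HF chat template, then tokenize the
# same string with both tokenizers and compare the resulting IDs.
rendered = hf_tok.apply_chat_template(conversation, tokenize=False)

hf_ids = hf_tok(rendered, add_special_tokens=False)["input_ids"]
gguf_ids = llm.tokenize(rendered.encode("utf-8"), add_bos=False, special=True)

print("HF ids:  ", hf_ids[:20], "...", len(hf_ids), "tokens")
print("GGUF ids:", gguf_ids[:20], "...", len(gguf_ids), "tokens")
print("identical:", hf_ids == gguf_ids)
```

If the ID sequences differ (e.g. extra/missing BOS, or special tokens being split into plain text), that would point at a tokenizer/chat-template mismatch rather than at the quantization itself.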