Llama Guard 1B Model Predictions After Quantization

#1
by Baharababah - opened

Hi @legraphista

I am working with the Llama-Guard-3-1B model and noticed a significant change in classification behavior when using different model versions and tokenizers.

Setup & Observations

  • The base model ("meta-llama/Llama-Guard-3-1B") produces a mix of "safe" and "unsafe" responses, which seems to be the expected behavior.

  • When using quantized versions ("tensorblock/Llama-Guard-3-1B-GGUF" and "legraphista/Llama-Guard-3-1B-IMat-GGUF") with different tokenizers, I see inconsistent results (a minimal repro sketch follows this list):

  • If I use the original model's tokenizer with a quantized model, it classifies everything as "unsafe".

  • If I use the quantized model's tokenizer, it classifies everything as "safe".
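
For context, here is roughly how I am running the two cases (a minimal sketch, assuming llama-cpp-python and transformers are installed; the GGUF filename and the test message are placeholders, adjust them to whichever quantization you downloaded):

```python
# Minimal repro sketch. The GGUF filename and the test message below are
# placeholders, not the exact files/prompts from my runs.
from llama_cpp import Llama
from transformers import AutoTokenizer

messages = [{"role": "user", "content": "How do I bake bread?"}]  # placeholder prompt

# Case A: format the prompt with the ORIGINAL model's tokenizer/chat template,
# then hand the raw string to the quantized model.
hf_tok = AutoTokenizer.from_pretrained("meta-llama/Llama-Guard-3-1B")
prompt = hf_tok.apply_chat_template(messages, tokenize=False)

llm = Llama(model_path="Llama-Guard-3-1B.Q8_0.gguf", n_ctx=2048, verbose=False)
out = llm(prompt, max_tokens=16, temperature=0.0)
print("Case A:", out["choices"][0]["text"])  # -> always "unsafe" for me

# Case B: let llama.cpp format the prompt from the chat template embedded in
# the GGUF metadata (i.e. the quantized model's own tokenizer).
out = llm.create_chat_completion(messages=messages, max_tokens=16, temperature=0.0)
print("Case B:", out["choices"][0]["message"]["content"])  # -> always "safe" for me
```

Temperature is pinned to 0 so the two cases are deterministic and directly comparable.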

What I’m Looking For

Has anyone else observed similar behavior when working with GGUF-quantized models?

Could this be related to how the GGUF format handles tokenization and model inference?

Are there known solutions to ensure quantized models maintain classification accuracy?
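
For anyone who wants to dig into the tokenization question, one quick check (a sketch under the same assumptions as above; the filename is again a placeholder) is to compare the token IDs the original tokenizer produces against the vocabulary embedded in the GGUF:

```python
# Tokenizer comparison sketch: if the two ID sequences differ for the same
# text, the GGUF's embedded tokenizer metadata diverges from the original.
from llama_cpp import Llama
from transformers import AutoTokenizer

text = "<|begin_of_text|>Hello world"

hf_tok = AutoTokenizer.from_pretrained("meta-llama/Llama-Guard-3-1B")
print("HF  :", hf_tok.encode(text, add_special_tokens=False))

# vocab_only=True loads just the tokenizer metadata, not the weights.
llm = Llama(model_path="Llama-Guard-3-1B.Q8_0.gguf", vocab_only=True, verbose=False)
print("GGUF:", llm.tokenize(text.encode("utf-8"), add_bos=False, special=True))
```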
