Given example code results in S1: Violent Crimes
#3
by
MarktHart
- opened
The example code given in the repository incorrectly returns S1 for the safe example conversation. Is the code or the model somehow broken?
Notably, the same code does give the correct "safe" response when the model is loaded in float16.
Environment:
Driver Version: 550.54.15
CUDA Version: 12.4
GPU: RTX 4090
Torch: 2.2.2
Transformers: 4.40.0
Thanks for flagging; we were able to reproduce the issue. It appears to be a bug in how the input prompt is constructed in the HF example.
We verified that the llama-recipes example works as expected, and we are working on a fix for the HF one.
This was fixed by a PR from the HF team on 4/19: https://huggingface.co/meta-llama/Meta-Llama-Guard-2-8B/commit/bb78080332eda00343dc37b0465b43bbf22c0251
Thanks for fixing it and for the follow-up.
MarktHart
changed discussion status to
closed