Can I apply 4-Bit Quantization on Llama-3.2-11B-Vision-Instruct using TGI?

#89
by jaimin-at-work

I am trying 4-bit quantization on the Llama-3.2-11B-Vision-Instruct model using TGI's bitsandbytes method, but I am getting the error below:

"NotImplementedError: 4bit quantization is not supported for AutoModel"
