w4a16 quant?
#1 opened by twhitworth
Would love to see this model quantized in w4a16.
Appreciate all the commits to vLLM, guys! I'll have to drop in live to office hours soon.
I'll have to start messing with LLM Compressor once I finish my current project.
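For reference, a minimal sketch of what a w4a16 one-shot quantization run with LLM Compressor looks like. The GPTQModifier recipe and the open_platypus calibration set follow LLM Compressor's published examples rather than anything stated in this thread, and microsoft/phi-4 is assumed as the base model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.transformers import oneshot

MODEL_ID = "microsoft/phi-4"  # assumed base model for this thread

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# W4A16: 4-bit weights, 16-bit activations; keep lm_head in full precision
recipe = GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"])

oneshot(
    model=model,
    dataset="open_platypus",       # calibration set (example choice)
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
)

# Save in compressed-tensors format so vLLM can load it directly
model.save_pretrained("phi-4-w4a16", save_compressed=True)
tokenizer.save_pretrained("phi-4-w4a16")
```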
Thanks for the feedback. I'm not sure I understood it correctly, but if you're looking for a w4a16 version of the model, we have it: https://huggingface.co/RedHatAI/phi-4-quantized.w4a16
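For anyone landing here, a minimal sketch of serving that checkpoint with vLLM (the prompt and sampling settings are illustrative):

```python
from vllm import LLM, SamplingParams

# vLLM reads the compressed-tensors quantization config from the repo
llm = LLM(model="RedHatAI/phi-4-quantized.w4a16")
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Explain w4a16 quantization in one sentence."], params)
print(outputs[0].outputs[0].text)
```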
alexmarques changed discussion status to closed