w4a16 quant?

#1
by twhitworth - opened

I'd love to see this model quantized as w4a16.

I appreciate all the commits to vLLM, guys!! I'll have to drop in to office hours live soon.

I'll have to start messing with LLM Compressor once I finish my current project.

Red Hat AI org

Thanks for the feedback. I'm not sure I understood correctly, but if you're looking for a w4a16 version of the model, we have one: https://huggingface.co/RedHatAI/phi-4-quantized.w4a16

alexmarques changed discussion status to closed
