quant script

by ehartford - opened 8 days ago

Discussion

ehartford

8 days ago

Can you please share the exact quant script you used?

ehartford

8 days ago

I am guessing you first converted to bf16 and then from there to w4a16?

Did you use DeepSeek's script for the bf16 conversion?

https://huggingface.co/deepseek-ai/DeepSeek-V3/blob/main/inference/fp8_cast_bf16.py

ekurtic

Red Hat AI org 5 days ago

We used https://github.com/IST-DASLab/MoE-Quant for quantization. It casts the weights to bf16 on the fly, but that code is basically the DeepSeek script you linked.
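
For anyone following along, here is a rough sketch of what the FP8-to-bf16 cast amounts to. It assumes the weights are stored with one inverse scale per square block (DeepSeek-V3 uses 128x128 blocks) and that each element is multiplied by its block's scale. This is not the MoE-Quant or DeepSeek code: the function name `dequantize_blockwise`, the toy 2x2 block size, and the plain Python floats standing in for FP8 values are all assumptions made for illustration, and the final cast to bf16 is left out.

```python
# Illustrative sketch only, not the MoE-Quant or DeepSeek implementation.
# Plain floats stand in for already-decoded FP8 values, and block_size=2
# replaces the 128x128 blocks assumed for the real checkpoint.

def dequantize_blockwise(weight, scale_inv, block_size):
    """Multiply each weight element by the scale of the block it falls in."""
    rows, cols = len(weight), len(weight[0])
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # One scale per (block_size x block_size) tile of the weight matrix.
            s = scale_inv[i // block_size][j // block_size]
            row.append(weight[i][j] * s)
        out.append(row)
    return out


# Toy 4x4 weight matrix split into four 2x2 blocks.
weight = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [1.0, 1.0, 2.0, 2.0],
    [1.0, 1.0, 2.0, 2.0],
]
# One (hypothetical) inverse scale per block.
scale_inv = [
    [0.5, 2.0],
    [1.0, 0.25],
]

dequantized = dequantize_blockwise(weight, scale_inv, block_size=2)
for row in dequantized:
    print(row)
```

In a real pipeline this multiply would run on tensors on the GPU, with the result cast to bf16 before being passed to the w4a16 quantizer.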

ekurtic changed discussion status to closed 5 days ago
