Do you have deepseek-r1-0528-awq plan?

#68
by oliver0102 - opened

The AWQ version runs smoothly on 8*H20, which is the most powerful graphics card I have.

@v2ray from https://huggingface.co/cognitivecomputations will follow up with awq soon

Maybe? AWQ has moved to vllm-compressor, and it doesn't fully work for MoE yet;
we will give it a try.

I am doing DeepSeek-R1-Zero AWQ quants now. If that finishes and works successfully, I'll try to make AWQ quants of this model.


I am using your DeepSeek-R1-AWQ every day. I can run up to 220 tokens/s on 8*H20, and I didn't notice any difference compared with the original model. AWQ is a really good quantization method, so I am looking forward to an AWQ version being released soon after this LITTLE upgrade of R1. DeepSeek-R1 is good at programming but has a significant hallucination issue, which prevents us from serving it as the base model for one of our code-related agents. Hopefully this one resolves that issue.

@oliver0102 @erichartford

Here's an AWQ quant for the new DeepSeek R1-0528 - https://huggingface.co/adamo1139/DeepSeek-R1-0528-AWQ
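For anyone who wants to try it, here is a minimal launch sketch for serving that quant with vLLM on an 8-GPU node. This is an illustrative command, not from the quant's model card: the exact flags (and whether `--quantization awq` is needed at all, since vLLM usually auto-detects AWQ checkpoints) depend on your vLLM version, so check its docs.

```shell
# Sketch: serve the AWQ quant with vLLM across 8 GPUs (e.g. 8*H20).
# Assumes vLLM is installed and there is enough GPU memory for the model.
vllm serve adamo1139/DeepSeek-R1-0528-AWQ \
    --tensor-parallel-size 8 \
    --quantization awq \
    --trust-remote-code
```

Once the server is up, it exposes an OpenAI-compatible API on port 8000 by default.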

Eric, thanks for your notes in the discussion here, they were really helpful.
