quantize deepseek-r1-0528 please
#14
by
aabbccddwasd
- opened
r1-0528 is an awesome model, and an FP4 model can reach 80 tokens/s using Microsoft Tutel. We really need an FP4 version of r1-0528.
You can also just download the official deepseek-ai/DeepSeek-R1-0528. On A100, the docker image v20250601 will quantize it inline to FP4 so that it fits into A100x8 memory.
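To see why FP4 matters for fitting into A100x8 memory, here is a rough back-of-the-envelope sketch. It assumes about 671B parameters (taken from the `deepseek-671b` image name) and the 80 GB A100 variant, and it counts weights only; KV cache, activations, and quantization scale factors are ignored, so real usage will be higher. The function name is just illustrative, not part of Tutel.

```python
def weight_footprint_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight-only memory in GB (1 GB = 10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 671e9   # assumption: parameter count implied by "deepseek-671b"
GPU_MEMORY_GB = 80   # assumption: A100 80 GB variant (a 40 GB variant also exists)
NUM_GPUS = 8

total_gb = GPU_MEMORY_GB * NUM_GPUS  # aggregate memory across the node

# Compare an 8-bit and a 4-bit weight format against the aggregate memory.
for name, bits in [("FP8", 8), ("FP4", 4)]:
    need_gb = weight_footprint_gb(NUM_PARAMS, bits)
    fits = need_gb < total_gb
    print(f"{name}: ~{need_gb:.1f} GB of weights vs {total_gb} GB total, fits: {fits}")
```

Under these assumptions, 8-bit weights alone exceed the 8x80 GB budget, while 4-bit weights leave room for runtime buffers.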
@ghostplant Thank you for the reply, but I cannot find the docker image v20250601 you mentioned.
Docker instructions for N-GPUs and A-GPUs, respectively, are here: https://hub.docker.com/r/tutelgroup/deepseek-671b