quantize deepseek-r1-0528 please

#14
by aabbccddwasd - opened

R1-0528 is an awesome model, and an FP4 model can reach 80 tokens/s using Microsoft Tutel. We really need an FP4 version of R1-0528.

You can also just download the official deepseek-ai/DeepSeek-R1-0528. On A100, the Docker image v20250601 quantizes it to FP4 inline so that it fits into A100x8 memory.
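For a rough sense of why 4-bit matters here, a back-of-the-envelope sketch of weight memory only. The 671B parameter count is taken from the "671b" in the model name, the 80 GB A100 variant is an assumption, and KV cache, activations, and quantization scale overhead are all ignored:

```python
# Rough weight-only memory estimate (not a measurement).
PARAMS = 671e9  # assumed from the "671b" in the model name


def weight_gb(bits_per_param: float, params: float = PARAMS) -> float:
    """Weight storage in GB (10^9 bytes) at the given bits per parameter."""
    return params * bits_per_param / 8 / 1e9


A100_80GB_X8 = 8 * 80  # total GB, assuming the 80 GB A100 variant

for name, bits in [("8-bit", 8), ("4-bit", 4)]:
    gb = weight_gb(bits)
    print(f"{name}: {gb:.1f} GB of weights, fits in {A100_80GB_X8} GB: {gb < A100_80GB_X8}")
```

Under these assumptions the 8-bit weights alone exceed the 8-GPU total, while the 4-bit weights take roughly half of it.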

@ghostplant Thank you for the reply, but I cannot find the Docker image v20250601 you mentioned.

The Docker instructions for N-GPUs and A-GPUs, respectively, are here: https://hub.docker.com/r/tutelgroup/deepseek-671b
