quantize deepseek-r1-0528 please

#14

by aabbccddwasd - opened Jun 2

Jun 2

R1-0528 is an awesome model, and an FP4 model can reach 80 tokens/s using Microsoft Tutel. We really need an FP4 version of R1-0528.

ghostplant

Jun 14

•

edited Jun 14

> R1-0528 is an awesome model, and an FP4 model can reach 80 tokens/s using Microsoft Tutel. We really need an FP4 version of R1-0528.

You can also just download the official deepseek-ai/DeepSeek-R1-0528 checkpoint. On A100, the Docker image v20250601 quantizes it to FP4 inline so that it fits into the memory of 8x A100.
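
For a rough sense of why FP4 matters for the memory fit, here is a back-of-the-envelope sketch. It is not taken from Tutel. The 671B parameter count is read off the image name (deepseek-671b), the 80 GB per-GPU figure is an assumption (A100 also ships with 40 GB), and the FP8 row assumes the official checkpoint is stored mostly in FP8. Only weights are counted: KV cache, activations, and quantization scale factors are ignored.

```python
# Weights-only memory estimate; all constants below are assumptions for illustration.
PARAMS = 671e9      # assumed parameter count, from the "deepseek-671b" image name
GPUS = 8            # A100x8, as mentioned in the post
GPU_MEM_GB = 80     # assumption: 80 GB A100 (a 40 GB variant also exists)

BITS_PER_WEIGHT = {"fp8": 8, "fp4": 4}


def weight_gb(params: float, bits: int) -> float:
    """Return the weight footprint in GB (1 GB = 10^9 bytes)."""
    return params * bits / 8 / 1e9


def fits(params: float, bits: int, gpus: int = GPUS, gpu_mem_gb: float = GPU_MEM_GB) -> bool:
    """Check whether the weights alone fit in the combined GPU memory."""
    return weight_gb(params, bits) <= gpus * gpu_mem_gb


for name, bits in BITS_PER_WEIGHT.items():
    size = weight_gb(PARAMS, bits)
    print(f"{name}: {size:.1f} GB of weights, fits in {GPUS}x{GPU_MEM_GB} GB: {fits(PARAMS, bits)}")
```

Under these assumptions the FP8 weights alone exceed the 640 GB available on 8x 80 GB, while the FP4 weights take roughly half of it, which leaves room for everything this estimate ignores.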

RoundtTble

Jul 2

@ghostplant Thank you for the reply, but I cannot find the Docker image v20250601 you mentioned.

ghostplant

Jul 5

> @ghostplant Thank you for the reply, but I cannot find the Docker image v20250601 you mentioned.

The Docker instructions for N-GPUs and A-GPUs, respectively, are here: https://hub.docker.com/r/tutelgroup/deepseek-671b
