ghostplant

AI & ML interests

None yet

Recent Activity

new activity 3 days ago

deepseek-ai/DeepSeek-R1-0528: Just deployed the full DeepSeek R1 0528 version. Has inference performance really improved this much? Wasn't the architecture unchanged?

new activity 6 days ago

deepseek-ai/DeepSeek-R1-0528: How to run the 0528 version on GPUs that don't support FP8

new activity 7 days ago

deepseek-ai/DeepSeek-R1-0528: What output does everyone get for this question?

Organizations

None yet

ghostplant's activity

New activity in deepseek-ai/DeepSeek-R1-0528 3 days ago

Just deployed the full DeepSeek R1 0528 version. Has inference performance really improved this much? Wasn't the architecture unchanged?

#75 opened 7 days ago by

New activity in deepseek-ai/DeepSeek-R1-0528 6 days ago

How to run the 0528 version on GPUs that don't support FP8

#64 opened 7 days ago by

New activity in deepseek-ai/DeepSeek-R1-0528 7 days ago

What output does everyone get for this question?

#49 opened 8 days ago by

New activity in unsloth/DeepSeek-R1-GGUF about 2 months ago

Sharing an MMLU test result: I used the 2.51-bit quant and compared it with the DS API and Baidu's DS. The 2.51-bit quant seems very smart, at least on MMLU

#42 opened 3 months ago by

New activity in deepseek-ai/DeepSeek-R1 2 months ago

Does R1 support long context (> 4K)?

#172 opened 3 months ago by

New activity in nvidia/DeepSeek-R1-FP4 2 months ago

Can this model run on Hopper GPUs?

#8 opened 3 months ago by

Can this model run on A800?

#10 opened 3 months ago by

Why not use FP2 or IQ2 as kTransformers does?

#11 opened 3 months ago by

New activity in deepseek-ai/DeepSeek-R1 3 months ago

Deploying production ready service with Unsloth GGUF quants on your AWS account. (4 x L40S)

#171 opened 3 months ago by samagra-tensorfuse

90+ tokens per second for MI300x8 using batch_size = 1

#166 opened 3 months ago by

New activity in unsloth/DeepSeek-R1-GGUF 3 months ago

Which is better, Q2_K_XL or Q4?

#34 opened 4 months ago by

New activity in deepseek-ai/DeepSeek-R1 4 months ago

So how much VRAM is needed to deploy a 671B model, and is there a baseline hardware configuration?

#118 opened 4 months ago by

New activity in deepseek-ai/DeepSeek-R1-Distill-Llama-70B 4 months ago

How much VRAM do you need?

#12 opened 4 months ago by

New activity in unsloth/DeepSeek-R1-GGUF 4 months ago

Is there a model removing non-shared MoE experts?

#17 opened 4 months ago by

New activity in deepseek-ai/DeepSeek-R1-Distill-Qwen-32B 4 months ago

Please convert these models to GGUF format...

#12 opened 4 months ago by