[Cache Request] deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
#359 by kvasist
Please add deepseek-ai/DeepSeek-R1-Distill-Qwen-32B to the neuron cache. It is currently available with sequence_length=4096, but we need a larger context window.