Distilled Reasoning Models with Activation Sparse
AI & ML interests
ML algorithms and systems
Reproductions of the DeepSeek distilled models, based on open-r1.
- InfiniAILab/OpenR1-Qwen-3B-SFT-Instruct
  Text Generation • 3B • 4 • 1
- InfiniAILab/OpenR1-Qwen-7B-SFT-Instruct
  Text Generation • 8B • 166 • 2
- InfiniAILab/OpenR1-Qwen-7B-Math-Instruct
  Text Generation • 8B • 4
- InfiniAILab/OpenR1-Qwen-1.5B-SFT-Instruct
  Text Generation • 2B • 2
Draft models for Llama, Qwen, QwQ, Mistral ...