DeepScaleR-1.5B-Preview is a language model fine-tuned from DeepSeek-R1-Distill-Qwen-1.5B. Reported to beat OpenAI's o1-preview on math benchmarks.
ThomasBaruzier
EXAONE-3.5
EXAONE 3.5 language model series including instruction-tuned models of 2.4B, 7.8B, and 32B.
Qwen QwQ
Qwen with Questions
Llama 3.2 Instruct
Llama 3.2 language models, featuring instruction-tuned models of 2 sizes, including 1B and 3B.
Llama 3 Instruct
Llama 3 language models, featuring instruction-tuned models of 2 sizes, including 8B and 70B.
DeepSeek-R1-ReDistill
Re-distilled DeepSeek R1 models
Qwen 2.5 Coder Instruct
Code-specific model series based on Qwen2.5
- ThomasBaruzier/Qwen2.5-Coder-0.5B-Instruct-GGUF
  Text Generation • 0.5B • Updated • 235 • 1
- ThomasBaruzier/Qwen2.5-Coder-1.5B-Instruct-GGUF
  Text Generation • 2B • Updated • 194
- ThomasBaruzier/Qwen2.5-Coder-3B-Instruct-GGUF
  Text Generation • 3B • Updated • 264
- ThomasBaruzier/Qwen2.5-Coder-7B-Instruct-GGUF
  Text Generation • 8B • Updated • 205
Qwen 2.5 Instruct
Qwen 2.5 language models, featuring instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B.
- ThomasBaruzier/Qwen2.5-0.5B-Instruct-GGUF
  Text Generation • 0.5B • Updated • 560
- ThomasBaruzier/Qwen2.5-1.5B-Instruct-GGUF
  Text Generation • 2B • Updated • 451
- ThomasBaruzier/Qwen2.5-3B-Instruct-GGUF
  Text Generation • 3B • Updated • 104
- ThomasBaruzier/Qwen2.5-7B-Instruct-GGUF
  Text Generation • 8B • Updated • 109
Llama 3.1 Instruct
Llama 3.1 language models, featuring instruction-tuned models of 3 sizes, including 8B, 70B, and 405B.
Gemma 2
Gemma 2 language models, featuring instruction-tuned models of 3 sizes, including 2B, 9B, and 27B.