KMasaki/DeepSeek-R1-Distill-Qwen-1.5B-GRPO
Text Generation
Transformers
Safetensors
open-r1/OpenR1-Math-220k
qwen2
Generated from Trainer
open-r1
trl
grpo
conversational
text-generation-inference
arxiv:2402.03300
main / DeepSeek-R1-Distill-Qwen-1.5B-GRPO / config.json
Commit History
End of training | a0a30e7 | verified | KMasaki committed 20 days ago
Training in progress, step 3347 | 1182be3 | verified | KMasaki committed 20 days ago
End of training | 8578253 | verified | KMasaki committed 20 days ago
Training in progress, step 3200 | 805daa7 | verified | KMasaki committed 20 days ago
End of training | 1b3a6df | verified | KMasaki committed 20 days ago
Training in progress, step 400 | aed6d0d | verified | KMasaki committed 24 days ago
End of training | 9f545cf | verified | KMasaki committed 26 days ago
Training in progress, epoch 0 | b482d92 | verified | KMasaki committed 26 days ago
End of training | 249dd33 | verified | KMasaki committed 27 days ago
Training in progress, epoch 0 | e447237 | verified | KMasaki committed 27 days ago