Hugging Face
KMasaki/DeepSeek-R1-Distill-Qwen-1.5B-GRPO
Tags: Text Generation, Transformers, Safetensors, open-r1/OpenR1-Math-220k, qwen2, Generated from Trainer, open-r1, trl, grpo, conversational, text-generation-inference, arxiv:2402.03300
main / DeepSeek-R1-Distill-Qwen-1.5B-GRPO / all_results.json
Commit History
Model save (093d421, verified), KMasaki committed 20 days ago
Model save (b33b9f2, verified), KMasaki committed 20 days ago
Model save (526ad82, verified), KMasaki committed 20 days ago
Model save (694f08c, verified), KMasaki committed 26 days ago
Model save (c736381, verified), KMasaki committed 27 days ago