ypwang61/One-Shot-RLVR-R1-Distill-1.5B-pi1

README.md exists but content is empty.

Downloads last month: 27

Safetensors
Model size: 1.78B params
Tensor type: F32

Collection including ypwang61/One-Shot-RLVR-R1-Distill-1.5B-pi1

One-Shot RLVR

Collection of models and papers for the work "Reinforcement Learning for Reasoning in Large Language Models with One Training Example" • 14 items • Updated 30 days ago