hdong0/Qwen2.5-Math-1.5B-Open-R1-GRPO_deepscaler_100steps_lr1e-6_kl1e-3_acc • Text Generation • Updated 1 day ago
hdong0/Qwen2.5-Math-1.5B-Open-R1-GRPO_deepscaler_1000steps_lr1e-6_kl1e-3_acc • Text Generation • Updated 1 day ago
hdong0/Qwen2.5-Math-1.5B-Open-R1-GRPO_MATH_1000steps_lr1e-6_kl1e-3_acc • Text Generation • Updated about 6 hours ago
hdong0/Qwen2.5-Math-1.5B-Open-R1-Distill_deepmath_first_3epoch • Text Generation • Updated about 2 hours ago