AI & ML interests
None yet
Organizations
None yet
zheminh/grpo-0612-verl-math-fmt-with-acc-strict-step08
8B
•
Updated
•
1
zheminh/grpo-0609-verl-math-step09
8B
•
Updated
•
2
zheminh/grpo-0602-verl-step12
8B
•
Updated
•
2
zheminh/grpo-0531-ckpt150
8B
•
Updated
•
2
zheminh/grpo-0529-length-comparison-ckpt90
8B
•
Updated
zheminh/grpo-0527-ckpt120
8B
•
Updated
•
2
zheminh/DeepSeek-R1-Distill-Qwen-7B-Trajectory-Data-MATH-Olympiad-Merged-Chains-with-prompt-ckpt1200
8B
•
Updated
•
1
zheminh/DeepSeek-R1-Distill-Qwen-7B-Trajectory-Data-Olympiad-Merged-Chains-ckpt1200
8B
•
Updated
•
1
zheminh/DeepSeek-R1-Distill-Qwen-7B-Trajectory-Data-MATH-Olympiad-Merged-Chains-ckpt1600
8B
•
Updated
•
1
zheminh/DeepSeek-R1-Distill-Qwen-7B-Trajectory-Data-MATH-Merged-Chains-ckpt700
8B
•
Updated
•
4
zheminh/DeepSeek-R1-Distill-Qwen-7B-Trajectory-Data-MATH-Merged-Chains-ckpt300
8B
•
Updated
•
2
zheminh/DeepSeek-R1-Distill-Qwen-7B-Trajectory-Data-Math-DAPO-ckpt240
8B
•
Updated
•
1
zheminh/DeepSeek-R1-Distill-Qwen-7B-Trajectory-Data-Math-DAPO-ckpt180
zheminh/DeepSeek-R1-Distill-Qwen-7B-Trajectory-Data-Math-DAPO-ckpt210
8B
•
Updated
•
1
zheminh/DeepSeek-R1-Distill-Qwen-7B-Trajectory-Data-MATH-Olympiad-ckpt1200
zheminh/DeepSeek-R1-Distill-Qwen-7B-Trajectory-Data-MATH-Olympiad-ckpt800
8B
•
Updated
•
1
zheminh/DeepSeek-R1-Distill-Qwen-7B-DAPO-ckpt90
8B
•
Updated
•
1
zheminh/DeepSeek-R1-Distill-Qwen-7B-RFT-R1-Distill-ckpt400
8B
•
Updated
•
1
zheminh/DeepSeek-R1-Distill-Qwen-7B-Trajectory-Data-Math-ckpt600
8B
•
Updated
•
1
zheminh/DeepSeek-R1-Distill-Qwen-7B-Trajectory-Data-Math-ckpt500
zheminh/DeepSeek-R1-Distill-Qwen-7B-Trajectory-Data-Math-ckpt300
8B
•
Updated
•
1
zheminh/DeepSeek-R1-Distill-Qwen-7B-Trajectory-Data-Math-ckpt400
8B
•
Updated
•
2
zheminh/DeepSeek-R1-Distill-Qwen-7B-Trajectory-Data-Math-ckpt150
8B
•
Updated
•
1
zheminh/Qwen3-8B-Execution-Data-Math-Orig-and-Dagger-think-ckpt4800
8B
•
Updated
•
1