models 80
VerlTool/SWE-Qwen3-8B-VT-grpo-n32-b256-t1.0-lr2e-6
8B • Updated • 2
VerlTool/pixel_reasoner-7b-grpo-n8-b128-t1.0-lr1e-6-complex-reward-new_global_step_50
8B • Updated
VerlTool/deepsearch-qwen_qwen3-8b-grpo-n16-b128-t1.0-lr1e-6-new_global_step_70
8B • Updated • 2
VerlTool/torl_qwen2.5-math-7b-grpo-n16-b128-t1.0-lr1e-6acc-only-global_step_200
8B • Updated
VerlTool/pixel_reasoner-7b-grpo-n8-b128-t1.0-lr1e-6-complex-reward_global_step_90
8B • Updated
VerlTool/pixel-reaoner-3b-grpo-n8-b128-t1.0-lr1e-6-complex-reward_global_step_100
4B • Updated • 1
VerlTool/torl_qwen2.5-math-1.5b-grpo-n16-b128-t1.0-lr1e-6-acc-only_global_step_340
2B • Updated • 2
VerlTool/sqlcoder-qwen2.5-coder-7b-instruct-grpo-n5-b256-t0.6-lr1e-6_global_step_60
8B • Updated • 3
VerlTool/search_r1_qa_em-qwen_qwen2.5-7b-grpo-n16-b512-64-t1.0-lr1e-6-dapo_global_step_140
8B • Updated
VerlTool/pixel_reasoner-7b-grpo-n8-b128-t1.0-lr1e-6_global_step_80
8B • Updated
datasets 11
VerlTool/SkyRL-SQL-Reproduction
• Updated • 5.91k • 97
• Updated • 4.8k • 91
• Updated • 10.2k • 8
VerlTool/AceCoderV2-69K-cleaned
• Updated • 69k • 5
VerlTool/AceCoderV2-122K-cleaned
• Updated • 123k • 6
• Updated • 123k • 29
• Updated • 69k • 10
VerlTool/openmathreasoning_tir_100K
• Updated • 104k • 29
• Updated • 29.1k • 25
VerlTool/Interpreter-Thinking
• Updated • 152k • 7