HerrHruby/online_acemath_rl_4b_inst_hard_16k_thinking_no_summ_thinking_step_90 4B • Updated about 2 hours ago
HerrHruby/2k_hard_5_05_8_steps_e2e_explore_small_boxed_all_bonus_pos_turn_num_2_mbzs_stable_230_steps 2B • Updated 20 days ago • 1.19k
HerrHruby/2k_hard_5_05_8_steps_e2e_explore_small_boxed_all_bonus_pos_turn_num_2_mbzs_stable_170_steps 2B • Updated 20 days ago • 291
HerrHruby/2k_hard_5_05_8_steps_e2e_explore_small_boxed_all_bonus_pos_turn_num_2_mbzs_stable_130_steps 2B • Updated 21 days ago • 26
HerrHruby/sft_qwen_3_1p7b_reasoning_cache_e2e_1p7b_deepscaler_16k_18k_2048_toks_560_steps 2B • Updated Sep 19 • 2
HerrHruby/interventions_qwen3_4b_inst_nothink_prefix_RL Viewer • Updated about 1 month ago • 2.07k • 112
HerrHruby/reasoning_cache_deepscalr_16k_1p7b_sft_e2e_summaries_2048_18k Viewer • Updated Sep 19 • 18.4k • 16