HerrHruby/offline_acemath_rl_4b_inst_hard_with_dishsoap_small_16k_thinking_no_summ_curr_step_100 4B • Updated about 10 hours ago • 11
HerrHruby/offline_acemath_rl_4b_inst_hard_with_dishsoap_small_16k_thinking_no_summ_curr_step_70 4B • Updated 1 day ago • 28
HerrHruby/online_acemath_rl_4b_hard_16k_thinking_no_summ_2_steps_2_samples_4b_base_028_clip_step_110 4B • Updated 1 day ago • 64
HerrHruby/online_acemath_rl_4b_hard_16k_thinking_no_summ_2_steps_2_samples_4b_base_027_clip_step_90 4B • Updated 3 days ago • 18
HerrHruby/online_pope_rl_800_4_steps_2_samples_omit_initial_thinking_step_2_step_140 4B • Updated 5 days ago • 16
HerrHruby/online_pope_rl_800_4_steps_2_samples_omit_initial_thinking_step_2_step_96 4B • Updated 7 days ago • 3
HerrHruby/online_acemath_rl_4b_inst_hard_16k_thinking_no_summ_2_steps_2_samples_4b_base_buggy_step_80 4B • Updated 7 days ago • 11
HerrHruby/online_acemath_rl_4b_inst_hard_with_dishsoap_curr_trained_90_steps_3_steps_2_samples_60_steps 4B • Updated 11 days ago • 135
HerrHruby/online_acemath_rl_4b_inst_hard_with_dishsoap_curr_trained_90_steps_3_steps_2_samples_20_steps 4B • Updated 12 days ago • 12
HerrHruby/online_acemath_rl_4b_inst_hard_16k_thinking_no_summ_4_steps_2_samples_step_200 4B • Updated 16 days ago • 16
HerrHruby/online_acemath_rl_4b_inst_hard_16k_thinking_no_summ_4_steps_2_samples_step_100 4B • Updated 17 days ago • 25
HerrHruby/offline_acemath_rl_4b_inst_hard_16k_thinking_no_summ_step_120 4B • Updated 18 days ago • 18
HerrHruby/online_acemath_rl_4b_inst_hard_16k_thinking_vanilla_like_step_90 4B • Updated 24 days ago • 66
HerrHruby/online_acemath_rl_4b_inst_hard_16k_thinking_no_summ_thinking_step_90 4B • Updated 24 days ago • 664
HerrHruby/acemath_rl_4b_inst_hard_2_steps_complete_16k_thinking_step_50 4B • Updated 25 days ago • 16
HerrHruby/2k_hard_5_05_8_steps_e2e_explore_small_boxed_all_bonus_pos_turn_num_2_mbzs_stable_230_steps 2B • Updated Oct 1 • 20
HerrHruby/2k_hard_5_05_8_steps_e2e_explore_small_boxed_all_bonus_pos_turn_num_2_mbzs_stable_170_steps 2B • Updated Oct 1 • 12
HerrHruby/2k_hard_5_05_8_steps_e2e_explore_small_boxed_all_bonus_pos_turn_num_2_mbzs_stable_130_steps 2B • Updated Sep 30 • 4
HerrHruby/sft_qwen_3_1p7b_reasoning_cache_e2e_1p7b_deepscaler_16k_18k_2048_toks_560_steps 2B • Updated Sep 19 • 2
HerrHruby/sft_qwen_3_1p7b_reasoning_cache_e2e_deepscaler_16k_38k_2048_toks_2100_steps 2B • Updated Sep 15 • 2
HerrHruby/sft_qwen_3_1p7b_reasoning_cache_e2e_deepscaler_16k_2048_toks_560_steps 2B • Updated Sep 13 • 8
HerrHruby/sft_qwen_3_1p7b_reasoning_cache_e2e_deepscaler_16k_2048_toks_400_steps 2B • Updated Sep 13 • 6