saurabh5/olmo-3-preference-mix-deltas_reasoning-yolo_even_split-DECON-no-chinese • Updated 2 days ago • 526k • 27
saurabh5/rlvr-prompts_responses-mixin_it_up-v2-filtered-no-chinese • Updated 3 days ago • 131k • 72
saurabh5/rlvr_mixin_it_up_prompts-qwen25-r1-distill-32b-1_5B-thoughts-x16-filtered-no-chinese • Updated 4 days ago • 97.6k • 133
saurabh5/rlvr_mixin_it_up_prompts-qwen25-r1-distill-32b-1_5B-thoughts-x16 • Updated 4 days ago • 95k • 124
saurabh5/rlvr_mixin_it_up_prompts-qwen3-32b-06B-thoughts-x8-filtered-no-chinese • Updated 7 days ago • 87k • 159
saurabh5/rlvr_mixin_it_up_prompts-qwen3-32b-06B-thoughts-x8 • Updated 7 days ago • 85.9k • 369
saurabh5/rlvr_mixin_it_up_prompts-qwen3-32b-06B-thoughts-x8-filtered • Updated 9 days ago • 97.5k • 105