rulins/qwen3-8B-sft-mix-v20250907_SFT_v1_mixture • Text Generation • 8B • Updated 3 days ago • 14
rulins/qwen3-8B-sft-mix-v20250907-4ep-fixed-template • Text Generation • 8B • Updated 3 days ago • 10
rulins/qwen3-8b-combined-sft-training-data-v20250901_nothinking • Text Generation • 8B • Updated 9 days ago • 25
rulins/250717_scholarqabench_with_gt_rlvr_with_system_prompt_qwq_rm_step500 • 8B • Updated Jul 17 • 10
rulins/250717_rl_rag_longform_rubrics_only_with_system_prompt_qwq_rm_step_400 • 8B • Updated Jul 17 • 13