YuchenLi01/generatedSoft_Qwen2.5Math1.5BInstruct_dpo_ebs32_lr5e-07_beta0.4_42 Text Generation • Updated 2 days ago
YuchenLi01/ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs64_lr1e-07_0 Text Generation • Updated 24 days ago • 10
YuchenLi01/ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs64_lr5e-06_0 Text Generation • Updated 24 days ago • 123
YuchenLi01/ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs32_lr5e-07_0 Text Generation • Updated 24 days ago • 6
YuchenLi01/ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs64_lr5e-07_0 Text Generation • Updated 25 days ago • 5
YuchenLi01/ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs64_lr1e-06_2 Text Generation • Updated 25 days ago • 4
YuchenLi01/ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs32_lr1e-06_0 Text Generation • Updated 26 days ago • 5
YuchenLi01/ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs64_lr5e-07_1 Text Generation • Updated 26 days ago • 4
YuchenLi01/ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs128_lr1e-06_2 Text Generation • Updated 27 days ago • 5
YuchenLi01/ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs32_lr1e-06_2 Text Generation • Updated 27 days ago • 5
YuchenLi01/MATH_Qwen2.5Math1.5BInstruct_Soft_DPO_Qwen2.5MathPRM72B_HardNoGT Viewer • Updated 24 minutes ago • 2.52k
YuchenLi01/MATH_Qwen2.5Math1.5BInstruct_Soft_DPO_Qwen2.5MathPRM72B_Hard Viewer • Updated about 2 hours ago • 3.14k
YuchenLi01/MATH_Qwen2.5Math1.5BInstruct_Soft_DPO_Qwen2.5MathPRM72B Viewer • Updated 1 day ago • 7.05k • 10
YuchenLi01/Math-Step-DPO-10K-augmented-Qwen2.5MathPRM72B Viewer • Updated about 1 month ago • 10.8k • 82