luckeciano/Qwen-2.5-7B-RL-LACPO-NoBaselineNoKLNoEntropyNoSmooth • Text Generation • Updated 15 days ago • 2
luckeciano/Qwen-2.5-7B-RL-LACPO-NoBaselineNoKLNoEntropy0.5NoSmooth • Text Generation • Updated 24 days ago • 1
luckeciano/Qwen-2.5-7B-RL-LACPO-NoBaselineNoKLNoEntropy0.5Smooth10 • Text Generation • Updated 2 days ago
luckeciano/Qwen-2.5-7B-RL-LACPO-BaselineNoKLNoEntropy0.1Smooth10 • Text Generation • Updated 22 days ago • 3
s-a-malik/Qwen-2.5-7B-Embedding-Entropy-0.45-Missing-Response • Text Generation • Updated 14 days ago • 1
luckeciano/Qwen-2.5-7B-RL-LACPO-BaselineNoKLNoEntropyNoSmoothSoftLabel • Text Generation • Updated 10 days ago • 542
luckeciano/Qwen-2.5-7B-RL-LACPO-BaselineNoKLNoEntropyNoSmoothVF0.1 • Text Generation • Updated 9 days ago • 1
chenggong1995/Qwen2.5-Math-7B-gen8-math3to5-ghpo-cold0-3Dhint-prompt1-epoch1 • Text Generation • Updated 5 days ago • 2
chenggong1995/Qwen2.5-Math-7B-gen8-math3to5-grpo-beta0-epoch1 • Text Generation • Updated 5 days ago • 1
luckeciano/Qwen-2.5-7B-RL-LACPO-BaselineNoKLNoEntropyNoSmoothSoftLabelNormAdv • Text Generation • Updated 2 days ago • 271