GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published about 23 hours ago • 89
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting Paper • 2601.02151 • Published 4 days ago • 84
fal/Qwen-Image-Edit-2511-Multiple-Angles-LoRA Image-to-Image • Updated 2 days ago • 3.15k • 221
LiquidAI/LFM2.5-1.2B-Instruct Text Generation • 1B • Updated about 12 hours ago • 5.79k • 219