SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications Paper • 2506.18951 • Published Jun 23 • 21
Article Letting Large Models Debate: The First Multilingual LLM Debate Competition By xuanricheng and 11 others • Nov 20, 2024 • 32
Open LLM Leaderboard 🏆 Track, rank and evaluate open LLMs and chatbots • 13.4k
WARM: On the Benefits of Weight Averaged Reward Models Paper • 2401.12187 • Published Jan 22, 2024 • 20 • 7