Agent tuning
THUDM/SWE-Dev-train • Viewer • Updated 6 days ago • 20.1k • 176 • 6
SWE-Gym/OpenHands-SFT-Trajectories • Viewer • Updated May 10 • 491 • 289 • 10
lmarena-ai/webdev-arena-preference-10k • Viewer • Updated Mar 10 • 10.5k • 312 • 10
SWE-bench/SWE-smith-trajectories • Viewer • Updated May 9 • 5.02k • 1.01k • 18
Agent Benchmarks
xw27/scibench • Viewer • Updated May 6, 2024 • 692 • 344 • 17
google/frames-benchmark • Viewer • Updated Oct 15, 2024 • 824 • 4.17k • 214
gaia-benchmark/GAIA • Updated Feb 13 • 9.61k • 383
HuggingFaceH4/MATH-500 • Viewer • Updated Nov 15, 2024 • 500 • 64.3k • 161