X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents Paper • 2504.13203 • Published 7 days ago • 20
ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering Paper • 2504.05506 • Published 15 days ago • 20
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published Feb 10 • 151
MapCoder: Multi-Agent Code Generation for Competitive Problem Solving Paper • 2405.11403 • Published May 18, 2024 • 3
CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging Paper • 2502.05664 • Published Feb 8 • 23 • 3