AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents Paper • 2506.14205 • Published Jun 17 • 7
MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research Paper • 2505.19955 • Published May 26 • 12
HardTests: Synthesizing High-Quality Test Cases for LLM Coding Paper • 2505.24098 • Published May 30 • 44
RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination Paper • 2505.21925 • Published May 28 • 36