SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published 5 days ago • 91
UI-TARS: Pioneering Automated GUI Interaction with Native Agents Paper • 2501.12326 • Published Jan 21 • 51
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training Paper • 2501.11425 • Published Jan 20 • 91
ProAgent: From Robotic Process Automation to Agentic Process Automation Paper • 2311.10751 • Published Nov 2, 2023 • 10
ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks Paper • 2311.09835 • Published Nov 16, 2023 • 11
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs Paper • 2307.16789 • Published Jul 31, 2023 • 99
Exploring Format Consistency for Instruction Tuning Paper • 2307.15504 • Published Jul 28, 2023 • 8