OAgents: An Empirical Study of Building Effective Agents
Abstract
Agentic AI has recently become an increasingly popular research field. However, we argue that current agent research practices lack standardization and scientific rigor, making it hard to conduct fair comparisons among methods. As a result, it remains unclear how different design choices in agent frameworks affect effectiveness, and measuring progress is challenging. In this work, we conduct a systematic empirical study on the GAIA benchmark and BrowseComp to examine, in a fair and rigorous manner, the impact of popular design choices for key agent components. We find that the lack of a standard evaluation protocol makes previous works, even open-sourced ones, non-reproducible, with significant variance across random runs. We therefore introduce a more robust evaluation protocol to stabilize comparisons. Our study reveals which components and designs are crucial for effective agents and which, despite seeming logical, are redundant. Based on these findings, we build and open-source OAgents, a new foundation agent framework that achieves state-of-the-art performance among open-source projects. OAgents offers a modular design for various agent components, promoting future research in Agentic AI.
Community
This is an empirical study on building effective agent frameworks, using the GAIA benchmark.