SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience
Abstract
SEAgent, an agentic self-evolving framework, enables computer-use agents to autonomously master novel software through experiential learning and an auto-generated curriculum of tasks, substantially outperforming existing open-source computer-use agents on OS-World.
Repurposing large vision-language models (LVLMs) as computer-use agents (CUAs) has led to substantial breakthroughs, driven primarily by human-labeled data. However, these models often struggle with novel and specialized software, particularly in scenarios lacking human annotations. To address this challenge, we propose SEAgent, an agentic self-evolving framework that enables CUAs to autonomously evolve through interaction with unfamiliar software. Specifically, SEAgent empowers computer-use agents to master novel software environments via experiential learning: agents explore new software, learn through iterative trial and error, and progressively tackle auto-generated tasks organized from simple to complex. To achieve this, we design a World State Model for step-wise trajectory assessment, along with a Curriculum Generator that produces increasingly diverse and challenging tasks. The agent's policy is updated through experiential learning, comprising adversarial imitation of failure actions and Group Relative Policy Optimization (GRPO) on successful ones. Furthermore, we introduce a specialist-to-generalist training strategy that integrates the individual experiential insights of specialist agents, yielding a stronger generalist CUA capable of continuous autonomous evolution. This unified agent ultimately surpasses ensembles of individual specialist agents on their own specialized software. We validate the effectiveness of SEAgent across five novel software environments within OS-World, improving the success rate by 23.2% absolute (from 11.3% to 34.5%) over a competitive open-source CUA, namely UI-TARS.
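For a concrete picture of the training loop the abstract describes, the sketch below shows how the pieces could fit together: a Curriculum Generator proposing progressively harder tasks, a World State Model judging each step of each rollout, GRPO with group-relative advantages on the resulting rewards, and likelihood suppression ("adversarial imitation") on actions judged as failures. All class names, method signatures (`agent.rollout`, `agent.grpo_update`, `agent.suppress_failures`, `world_state_model.judge`, `curriculum.next_tasks`), and the reward scheme are hypothetical illustrations, not the authors' released implementation.

```python
# A minimal, hypothetical sketch of the SEAgent self-evolution loop.
# Interfaces marked below are assumptions for illustration only.

import statistics
from dataclasses import dataclass, field


@dataclass
class Step:
    observation: str              # e.g., a screenshot of the software state
    action: str                   # the GUI action the agent emitted
    judged_success: bool = False  # filled in by the World State Model


@dataclass
class Trajectory:
    task: str
    steps: list[Step] = field(default_factory=list)

    @property
    def reward(self) -> float:
        # Sparse trajectory reward: 1 if every step is judged successful.
        ok = bool(self.steps) and all(s.judged_success for s in self.steps)
        return 1.0 if ok else 0.0


def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Standard GRPO advantages: normalize rewards within a group of
    rollouts sampled for the same task (mean-centered, std-scaled)."""
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards) or 1.0  # avoid division by zero
    return [(r - mu) / sigma for r in rewards]


def self_evolve(agent, world_state_model, curriculum, env,
                num_phases: int = 3, group_size: int = 8):
    """One specialist agent evolving on one unfamiliar software environment."""
    completed_tasks: list[str] = []
    for _ in range(num_phases):
        # Curriculum Generator proposes tasks from simple to complex.
        for task in curriculum.next_tasks(completed_tasks):
            # Sample a group of rollouts for the same task (needed by GRPO).
            group = [agent.rollout(task, env) for _ in range(group_size)]
            # World State Model performs step-wise trajectory assessment.
            for traj in group:
                for step in traj.steps:
                    step.judged_success = world_state_model.judge(step)
            advantages = group_relative_advantages([t.reward for t in group])
            # GRPO on successful rollouts; likelihood suppression on the
            # actions the judge marked as failures.
            agent.grpo_update(group, advantages)
            failed = [s for t in group for s in t.steps if not s.judged_success]
            agent.suppress_failures(failed)
            completed_tasks.append(task)
```

Under this reading, the abstract's specialist-to-generalist strategy would amount to running `self_evolve` once per software environment and then consolidating the resulting specialists into a single generalist policy; the paper leaves the consolidation mechanism to the full text.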
Community
The following similar papers were recommended by the Semantic Scholar API (via Librarian Bot):
- SEEA-R1: Tree-Structured Reinforcement Fine-Tuning for Self-Evolving Embodied Agents (2025)
- SEA: Self-Evolution Agent with Step-wise Reward for Computer Use (2025)
- VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning (2025)
- Unleashing Embodied Task Planning Ability in LLMs via Reinforcement Learning (2025)
- LaViPlan : Language-Guided Visual Path Planning with RLVR (2025)
- Beyond Policy Optimization: A Data Curation Flywheel for Sparse-Reward Long-Horizon Planning (2025)
- GHPO: Adaptive Guidance for Stable and Efficient LLM Reinforcement Learning (2025)