AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning
Abstract
AReaL, a fully asynchronous reinforcement learning system, decouples generation and training to achieve higher GPU utilization and up to 2.57x training speedup for large language models on reasoning tasks.
Reinforcement learning (RL) has become a trending paradigm for training large language models (LLMs), particularly for reasoning tasks. Effective RL for LLMs requires massive parallelization and creates a pressing need for efficient training systems. Most existing large-scale RL systems for LLMs are synchronous, alternating between generation and training in a batch setting where the rollouts in each training batch are produced by the same (or latest) model. This stabilizes RL training but incurs severe system-level inefficiency: generation must wait for the longest output in the batch to finish before the model can be updated, leaving GPUs underutilized. We present AReaL, a fully asynchronous RL system that completely decouples generation from training. Rollout workers in AReaL continuously generate new outputs without waiting, while training workers update the model whenever a batch of data is collected. AReaL also incorporates a collection of system-level optimizations, leading to substantially higher GPU utilization. To stabilize RL training, AReaL balances the workload of rollout and training workers to control data staleness, and adopts a staleness-enhanced PPO variant to better handle outdated training samples. Extensive experiments on math and code reasoning benchmarks show that AReaL achieves up to 2.57× training speedup over the best synchronous systems with the same number of GPUs, while matching or even improving final performance. The code of AReaL is available at https://github.com/inclusionAI/AReaL/.
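To make the decoupling concrete, below is a minimal sketch of the asynchronous rollout/training loop described in the abstract. All names (`rollout_worker`, `trainer`, `MAX_STALENESS`, `BATCH_SIZE`) and the placeholder generation/update steps are illustrative assumptions rather than the actual AReaL code; the real system distributes these roles across dedicated GPU workers. The bounded queue plays the role of workload balancing, and the version check implements staleness control.

```python
# Illustrative sketch only -- not the AReaL API. Generation and the PPO update
# are replaced by sleeps; the point is the producer/consumer structure and the
# staleness filter on sample versions.
import queue
import random
import threading
import time

MAX_STALENESS = 4      # drop samples generated more than 4 policy versions ago (assumed value)
BATCH_SIZE = 8

sample_queue = queue.Queue(maxsize=64)   # bounded queue balances rollout vs. training load
policy_version = 0                       # incremented after every trainer update
stop = threading.Event()


def rollout_worker():
    """Continuously generate rollouts without waiting for the trainer."""
    while not stop.is_set():
        version = policy_version                  # version of the weights used for generation
        time.sleep(random.uniform(0.01, 0.05))    # stand-in for LLM generation latency
        sample_queue.put({"tokens": [], "reward": random.random(), "version": version})


def trainer(num_updates=20):
    """Update the policy whenever a batch of sufficiently fresh samples is collected."""
    global policy_version
    for _ in range(num_updates):
        batch = []
        while len(batch) < BATCH_SIZE:
            sample = sample_queue.get()
            # Staleness control: discard samples produced by policies that are too old.
            if policy_version - sample["version"] <= MAX_STALENESS:
                batch.append(sample)
        time.sleep(0.05)   # stand-in for a (staleness-aware) PPO update on `batch`
        policy_version += 1
    stop.set()


workers = [threading.Thread(target=rollout_worker, daemon=True) for _ in range(4)]
for w in workers:
    w.start()
trainer()
print("finished", policy_version, "policy updates")
```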
Community
Project Page: https://github.com/inclusionAI/AReaL
AReaL is a fully asynchronous RL training system that uses system-algorithm co-design to achieve up to 2.57× training speedup while matching or even improving final performance. It completely decouples generation from training, leading to substantially higher GPU utilization.
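The abstract mentions a staleness-enhanced PPO variant for handling outdated samples. One common way to formulate such an objective is to decouple the (possibly stale) behavior policy that generated the rollouts from a recent "proximal" policy used as the clipping reference. The sketch below illustrates that idea under these assumptions; it is not necessarily the exact loss used in AReaL.

```python
import torch


def decoupled_ppo_loss(logp_new, logp_prox, logp_behav, advantages, clip_eps=0.2):
    """Sketch of a staleness-aware (decoupled) PPO loss -- illustrative, not AReaL's code.

    logp_new   -- token log-probs under the policy currently being optimized
    logp_prox  -- token log-probs under a recent "proximal" policy (clipping reference)
    logp_behav -- token log-probs under the possibly stale behavior policy that
                  generated the rollouts
    """
    # Importance weight corrects for the gap between the stale behavior policy and
    # the proximal policy; it is treated as a constant w.r.t. the gradient.
    behav_to_prox = torch.exp(logp_prox - logp_behav).detach()
    # PPO clipping is applied against the proximal policy rather than against the
    # behavior policy that produced the (outdated) samples.
    ratio = torch.exp(logp_new - logp_prox)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -(behav_to_prox * torch.minimum(unclipped, clipped)).mean()


# Toy usage with random tensors standing in for per-token quantities.
T = 16
logp_new = torch.randn(T, requires_grad=True)
loss = decoupled_ppo_loss(logp_new, torch.randn(T), torch.randn(T), torch.randn(T))
loss.backward()
```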
This is an automated message from the Librarian Bot. The following papers, similar to this one, were recommended by the Semantic Scholar API:
- LlamaRL: A Distributed Asynchronous Reinforcement Learning Framework for Efficient Large-scale LLM Training (2025)
- StreamRL: Scalable, Heterogeneous, and Elastic RL for LLMs with Disaggregated Stream Generation (2025)
- ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay (2025)
- Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving (2025)
- Token-Efficient RL for LLM Reasoning (2025)
- Diversity-Aware Policy Optimization for Large Language Model Reasoning (2025)
- QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning (2025)