REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards
Abstract
Reasoning Gym provides a library of reasoning environments with verifiable rewards and procedural data generation for reinforcement learning, enabling the evaluation and training of reasoning models at varying difficulty levels.
We introduce Reasoning Gym (RG), a library of reasoning environments for reinforcement learning with verifiable rewards. It provides over 100 data generators and verifiers spanning multiple domains including algebra, arithmetic, computation, cognition, geometry, graph theory, logic, and various common games. Its key innovation is the ability to generate virtually infinite training data with adjustable complexity, unlike most previous reasoning datasets, which are typically fixed. This procedural generation approach allows for continuous evaluation across varying difficulty levels. Our experimental results demonstrate the efficacy of RG both for evaluating reasoning models and for training them with reinforcement learning.
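To make the idea of procedural generation with verifiable rewards concrete, here is a minimal sketch in plain Python. It does not use the Reasoning Gym library or reproduce its API: the names `Entry`, `generate_entry`, and `score_answer`, the addition task, and the way `difficulty` scales the problem are all assumptions made for illustration only.

```python
import random
from dataclasses import dataclass


@dataclass
class Entry:
    """One generated problem together with its ground-truth answer."""
    question: str
    answer: str


def generate_entry(seed: int, index: int, difficulty: int) -> Entry:
    """Hypothetical generator: an addition problem whose size grows with difficulty.

    The same (seed, index, difficulty) triple always yields the same entry,
    so any number of distinct, reproducible problems can be created on demand.
    """
    rng = random.Random(seed * 1_000_003 + index)
    # Higher difficulty means more terms and larger operands (illustrative choice).
    terms = [rng.randint(1, 10 ** difficulty) for _ in range(difficulty + 1)]
    question = " + ".join(str(t) for t in terms) + " = ?"
    return Entry(question=question, answer=str(sum(terms)))


def score_answer(response: str, entry: Entry) -> float:
    """Hypothetical verifier: reward 1.0 for an exact match, 0.0 otherwise."""
    return 1.0 if response.strip() == entry.answer else 0.0


if __name__ == "__main__":
    for level in (1, 2, 3):
        entry = generate_entry(seed=42, index=0, difficulty=level)
        print(level, entry.question, score_answer(entry.answer, entry))
```

Because the answer is computed alongside the question, the reward needs no human labels, and raising `difficulty` gives harder problems from the same generator.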
Community
Super excited to share our open source library Reasoning Gym!
We provide over 100 data generators and verifiers spanning several domains (algebra, arithmetic, code, geometry, logic, games) for training the next generation of reasoning models.
In essence, we can generate an infinite amount of data, whose labels can be algorithmically and automatically verified.
This allows for the possibility of significantly scaling the Reinforcement Learning phase of training without being constrained by a lack of high-quality labeled data.
Your paper looks great! I'll try some experiments with Reasoning Gym this week. Thank you for your work!
Maybe you already saw it, but I found this RL paper from NVIDIA. It seems like they used the gym to generate training data: https://arxiv.org/abs/2505.24864
Hey, really glad you like our work, contributions to our open source repo are more than welcome!
Regarding the RL paper from NVIDIA: it's really cool to see the library being validated by other groups, and we hope it will come in handy to future research as well!
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles (2025)
- Nemotron-CrossThink: Scaling Self-Learning beyond Math Reasoning (2025)
- RM-R1: Reward Modeling as Reasoning (2025)
- Learning to Reason without External Rewards (2025)
- General-Reasoner: Advancing LLM Reasoning Across All Domains (2025)
- Reward Reasoning Model (2025)
- KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation (2025)