arxiv:2505.24760

REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Published on May 30

Submitted by zafstojano on Jun 3

#3 Paper of the day


Authors: Zafir Stojanovski, Oliver Stanley, Abdulhakeem Adefioye, Jean Kaddour

AI-generated summary

Reasoning Gym provides a library of reasoning environments with verifiable rewards and procedural data generation for reinforcement learning, enabling the evaluation and training of reasoning models at varying difficulty levels.

Abstract

We introduce Reasoning Gym (RG), a library of reasoning environments for reinforcement learning with verifiable rewards. It provides over 100 data generators and verifiers spanning multiple domains including algebra, arithmetic, computation, cognition, geometry, graph theory, logic, and various common games. Its key innovation is the ability to generate virtually infinite training data with adjustable complexity, unlike most previous reasoning datasets, which are typically fixed. This procedural generation approach allows for continuous evaluation across varying difficulty levels. Our experimental results demonstrate the efficacy of RG for both evaluating reasoning models and training them with reinforcement learning.
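The idea of a procedural generator paired with a verifier can be illustrated with a small, self-contained sketch. This is not the Reasoning Gym API: the names `Task`, `generate_tasks`, `score_answer`, and the `num_terms` difficulty knob are all hypothetical and chosen only for illustration. It shows a seeded generator that produces as many addition problems as requested at a chosen difficulty, plus a verifier that turns a model's output into a reward without any human labeling.

```python
import random
from dataclasses import dataclass


@dataclass
class Task:
    question: str
    answer: str


def generate_tasks(size: int, num_terms: int, seed: int) -> list[Task]:
    """Procedurally generate addition problems (hypothetical example task).

    num_terms acts as the difficulty knob: more terms, harder problem.
    """
    rng = random.Random(seed)  # seeded, so the same arguments reproduce the same data
    tasks = []
    for _ in range(size):
        terms = [rng.randint(1, 99) for _ in range(num_terms)]
        question = " + ".join(str(t) for t in terms) + " = ?"
        # The answer is computed, not annotated, so it is correct by construction.
        tasks.append(Task(question=question, answer=str(sum(terms))))
    return tasks


def score_answer(model_output: str, task: Task) -> float:
    """Verifiable reward: 1.0 for an exact match with the computed answer, else 0.0."""
    return 1.0 if model_output.strip() == task.answer else 0.0


if __name__ == "__main__":
    easy = generate_tasks(size=3, num_terms=2, seed=0)
    hard = generate_tasks(size=3, num_terms=6, seed=0)
    for task in easy + hard:
        print(task.question, "->", task.answer)
```

Because the answer is derived from the same random draw as the question, the dataset size is limited only by how many samples are requested, and the reward needs no external labels. A real environment would likely use a more forgiving verifier (for example, partial credit or answer extraction from free-form text); exact string match is assumed here only to keep the sketch short.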


Community

zafstojano

Paper author Paper submitter 2 days ago

Super excited to share our open source library Reasoning Gym!

We provide over 100 data generators and verifiers spanning several domains (algebra, arithmetic, code, geometry, logic, games) for training the next generation of reasoning models.
In essence, we can generate an infinite amount of data, whose labels can be algorithmically and automatically verified.
This allows for the possibility of significantly scaling the Reinforcement Learning phase of training without being constrained by a lack of high-quality labeled data.

CameronNLP

2 days ago

Your paper looks great! I'll try some experiments with Reasoning Gym this week. Thank you for your work!

Maybe you already saw it, but I found this RL paper from NVIDIA. It seems like they used the gym to generate training data: https://arxiv.org/abs/2505.24864

zafstojano

Paper author Paper submitter 2 days ago

Hey, really glad you like our work! Contributions to our open source repo are more than welcome.

Regarding the RL paper from NVIDIA: it's really cool to see the library being validated by other groups, and we hope it will come in handy to future research as well!