zafstojano
/

Qwen2.5-3B-Instruct-RG-Math

Text Generation

text-generation-inference

Model card Files Files and versions

This model was trained for our Reasoning Gym paper (https://arxiv.org/abs/2505.24760) using our Reasoning Gym repo (https://github.com/open-thought/reasoning-gym)

Downloads last month: 11

Safetensors

Model size

3B params

Tensor type

BF16

·

Model tree for zafstojano/Qwen2.5-3B-Instruct-RG-Math

Base model

Qwen/Qwen2.5-3B

Finetuned

Qwen/Qwen2.5-3B-Instruct

Finetuned

(1028)

this model

Paper for zafstojano/Qwen2.5-3B-Instruct-RG-Math

REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Paper • 2505.24760 • Published May 30, 2025 • 74