Introduction

Qwen2.5-32B-DialogueReason is a dialogue-based reasoning model built on Qwen2.5-32B-Base.
The model is trained on Open-Reasoner-Zero data with rule-based reinforcement learning (RL).
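
Rule-based RL scores each sampled response with a deterministic check instead of a learned reward model. The card does not spell out the rule that was used, so the following is only a minimal sketch of the idea, assuming the final answer is wrapped in `\boxed{...}` and compared to a reference by exact string match. The names `extract_boxed` and `rule_based_reward` are hypothetical and not part of any released code.

```python
from typing import Optional


def extract_boxed(response: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in the response, or None."""
    marker = "\\boxed{"
    start = response.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1  # we are inside the opening brace of \boxed{
    chars = []
    while i < len(response):
        ch = response[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars).strip()
        chars.append(ch)
        i += 1
    return None  # unbalanced braces: treat as "no answer given"


def rule_based_reward(response: str, reference: str) -> float:
    """Binary reward (assumed rule): 1.0 on exact match of the boxed answer, else 0.0."""
    answer = extract_boxed(response)
    if answer is None:
        return 0.0
    return 1.0 if answer == reference.strip() else 0.0
```

A real pipeline would likely normalize answers (whitespace, equivalent fractions, units) before comparing; exact match is used here only to keep the sketch short.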

🧠 Key Features

  • Qwen2.5-32B-Base as the foundation.
  • Rule-based RL used to train dialogue reasoning.
  • Dynamic agent initialization that adapts to various scenarios.
  • Flexible environment configuration for setting up task-specific contexts.
  • Multi-turn dialogue reasoning that solves problems incrementally.

Model size: 32.8B params · Tensor type: BF16 · Format: Safetensors
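
The parameter count and tensor type listed for this model determine how much memory the weights need, which in turn decides whether the checkpoint has to be sharded across devices. The sketch below estimates that footprint and shows one way the checkpoint might be loaded with the Hugging Face `transformers` library. It assumes the repository id `stepfun-ai/Qwen2.5-32B-DialogueReason` from the model page and a plain text prompt, since the card does not document a prompt template. The helper names `weight_memory_gib`, `load_model`, and `generate` are hypothetical.

```python
REPO_ID = "stepfun-ai/Qwen2.5-32B-DialogueReason"  # taken from the model page


def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the weights alone (BF16 uses 2 bytes per parameter)."""
    return num_params * bytes_per_param / 2**30


def load_model(repo_id: str = REPO_ID):
    """Load tokenizer and model; downloads the full checkpoint when called."""
    # Imported lazily so weight_memory_gib works without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.bfloat16,  # matches the BF16 tensor type listed above
        device_map="auto",           # needs the `accelerate` package installed
    )
    return tokenizer, model


def generate(tokenizer, model, prompt: str, max_new_tokens: int = 512) -> str:
    """Generate a continuation for a plain text prompt (prompt format is an assumption)."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With 32.8B parameters in BF16, `weight_memory_gib(32.8e9)` comes to roughly 61 GiB for the weights alone; activations and the KV cache add to that, so `device_map="auto"` is used here to let the weights spread across whatever devices are available.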