
AdaptThink: LLM Can Learn When to Think

🤗 HF Collections • 💻 Github Repo • 📃 Paper

🔍 Table of Contents

- 🤖️ AdaptThink
- ⚙️ Released Models
- 📊 Evaluation Results
- 📝 Citation

🤖️ AdaptThink

We present AdaptThink, a novel reinforcement learning (RL) algorithm that enables reasoning models to adaptively choose between Thinking and NoThinking modes according to the difficulty of each input problem, thereby achieving automatic hybrid reasoning. Specifically, the model engages in thinking only when the problem is judged to be challenging; for simpler questions, it bypasses the thinking process and directly produces a concise final solution. This approach substantially reduces inference costs while further improving overall performance.


⚙️ Released Models

All Available Datasets and Models

We apply the AdaptThink algorithm to DeepSeek-R1-Distill-Qwen-1.5B with $\delta$ ranging from 0 to 0.1, and to DeepSeek-R1-Distill-Qwen-7B with $\delta=0.05$. A larger $\delta$ yields a higher proportion of NoThinking responses, which further reduces inference cost but also diminishes the resulting improvement in accuracy.
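As a rough illustration of how $\delta$ steers this trade-off (a sketch of the idea only; see the paper for the exact constrained RL objective and the precise definitions of the reward $R$ and the reference accuracy $\bar{R}_{\mathrm{ref}}$), the training signal can be read as granting a NoThinking response a bonus of $\delta$ on top of its correctness reward, measured against the reference model's average accuracy on the same problem:

$$
A(x, y) \;\approx\; R(x, y) \;+\; \delta \cdot \mathbf{1}\big[\, y \text{ is NoThinking} \,\big] \;-\; \bar{R}_{\mathrm{ref}}(x)
$$

Under this reading, raising $\delta$ lets NoThinking responses win more often even when they are slightly less accurate, which matches the behavior described above.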

All the trained models are available on HuggingFace.

| Name | HF Repo |
| --- | --- |
| AdaptThink-1.5B-delta0 | 🤗 HF Repo |
| AdaptThink-1.5B-delta0.01 | 🤗 HF Repo |
| AdaptThink-1.5B-delta0.02 | 🤗 HF Repo |
| AdaptThink-1.5B-delta0.05 | 🤗 HF Repo |
| AdaptThink-1.5B-delta0.075 | 🤗 HF Repo |
| AdaptThink-1.5B-delta0.1 | 🤗 HF Repo |
| AdaptThink-7B-delta0.05 | 🤗 HF Repo |
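
A minimal inference sketch with 🤗 Transformers is shown below, using the model id of this repo. The chat-template call and generation settings are illustrative assumptions rather than the authors' official evaluation setup; see the GitHub repo for the official scripts.

```python
# Minimal inference sketch (assumed settings; not the official evaluation config).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "THU-KEG/AdaptThink-1.5B-delta0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# A simple question: the model is expected to skip the long thinking trace here
# and answer directly; harder problems should trigger a full reasoning chain.
messages = [{"role": "user", "content": "What is 15% of 240?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=2048, do_sample=True, temperature=0.6)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```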

📊 Evaluation Results

We list our evaluation results as follows:

1. Comparison with existing methods for efficient reasoning on mathematics datasets


2. NoThinking response ratio and accuracy across different difficulty levels on MATH500


3. Comparison of different $\delta$ values


4. Evaluation results on MMLU

📝 Citation

If you find our work useful, please consider citing AdaptThink:

@article{zhang2025adapt_think,
  title={AdaptThink: LLM Can Learn When to Think},
  author={Jiajie Zhang and Nianyi Lin and Lei Hou and Ling Feng and Juanzi Li},
  journal={arXiv preprint arXiv:2505.13417},
  url={https://arxiv.org/abs/2505.13417},
  year={2025}
}