Abstract
Thinkless lets an LLM adaptively choose between short-form and long-form reasoning via control tokens, cutting the use of long-chain reasoning by 50%-90% on standard benchmarks.
Reasoning Language Models, capable of extended chain-of-thought reasoning, have demonstrated remarkable performance on tasks requiring complex logical inference. However, applying elaborate reasoning for all queries often results in substantial computational inefficiencies, particularly when many problems admit straightforward solutions. This motivates an open question: Can LLMs learn when to think? To answer this, we propose Thinkless, a learnable framework that empowers an LLM to adaptively select between short-form and long-form reasoning, based on both task complexity and the model's ability. Thinkless is trained under a reinforcement learning paradigm and employs two control tokens, <short> for concise responses and <think> for detailed reasoning. At the core of our method is a Decoupled Group Relative Policy Optimization (DeGRPO) algorithm, which decomposes the learning objective of hybrid reasoning into two components: (1) a control token loss that governs the selection of the reasoning mode, and (2) a response loss that improves the accuracy of the generated answers. This decoupled formulation enables fine-grained control over the contributions of each objective, stabilizing training and effectively preventing collapse observed in vanilla GRPO. Empirically, on several benchmarks such as Minerva Algebra, MATH-500, and GSM8K, Thinkless is able to reduce the usage of long-chain thinking by 50%-90%, significantly improving the efficiency of Reasoning Language Models. The code is available at https://github.com/VainF/Thinkless.
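The decoupled objective admits a compact summary. Below is a minimal PyTorch sketch of the DeGRPO idea as described in the abstract, not the authors' implementation: the function name `degrpo_loss`, the weighting coefficient `alpha`, and the layout placing the control token at the first generated position are assumptions for illustration, and padding/length masking is omitted. See the repository linked above for the official code.

```python
import torch

def degrpo_loss(logprobs, old_logprobs, rewards, alpha=0.001, eps=0.2):
    """Sketch of a decoupled GRPO-style loss (illustrative, not official).

    logprobs, old_logprobs: (G, T) per-token log-probs for G rollouts of the
        same prompt; column 0 is assumed to be the control token
        (<short> or <think>), columns 1..T-1 the response tokens.
    rewards: (G,) scalar reward per rollout.
    alpha: assumed coefficient balancing mode selection against answer
        accuracy (the "decoupling" knob).
    """
    # Group-relative advantage, as in GRPO: normalize rewards within the group.
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)          # (G,)

    # PPO-style clipped importance ratio per token.
    ratio = torch.exp(logprobs - old_logprobs)                          # (G, T)
    surrogate = torch.min(ratio * adv[:, None],
                          torch.clamp(ratio, 1 - eps, 1 + eps) * adv[:, None])

    # Decoupled objective: one term for the mode-selection (control) token,
    # one term for the response tokens, combined with separate weights.
    control_loss = -surrogate[:, 0].mean()
    response_loss = -surrogate[:, 1:].mean()
    return alpha * control_loss + response_loss
```

In vanilla GRPO the single control token would be averaged in with hundreds of response tokens; pulling it into its own term lets its learning signal be rescaled independently, which the abstract credits with stabilizing training and preventing the collapse observed in vanilla GRPO.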
Community
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL (2025)
- ThinkSwitcher: When to Think Hard, When to Think Fast (2025)
- ShorterBetter: Guiding Reasoning Models to Find Optimal Inference Length for Efficient Reasoning (2025)
- Think Only When You Need with Large Hybrid-Reasoning Models (2025)
- SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning (2025)
- Making Small Language Models Efficient Reasoners: Intervention, Supervision, Reinforcement (2025)
- Scalable Chain of Thoughts via Elastic Reasoning (2025)
Models citing this paper 2
Datasets citing this paper 1
Spaces citing this paper 0