Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward
Abstract
Atom-Searcher, an RL framework integrating Atomic Thought and Reasoning Reward Models, enhances LLMs' multi-hop reasoning and strategic search capabilities, improving performance and interpretability.
Large language models (LLMs) exhibit remarkable problem-solving abilities, but struggle with complex tasks due to static internal knowledge. Retrieval-Augmented Generation (RAG) enhances access to external information, yet remains limited in multi-hop reasoning and strategic search due to rigid workflows. Recent advancements in agentic deep research empower LLMs to autonomously reason, search, and synthesize information. However, current approaches relying on outcome-based reinforcement learning (RL) face critical issues such as conflicting gradients and reward sparsity, limiting performance gains and training efficiency. To address these issues, we first propose Atomic Thought, a novel LLM thinking paradigm that decomposes reasoning into fine-grained functional units. These units are supervised by Reasoning Reward Models (RRMs), which provide Atomic Thought Rewards (ATR) for fine-grained guidance. Building on this, we propose Atom-Searcher, a novel RL framework for agentic deep research that integrates Atomic Thought and ATR. Atom-Searcher uses a curriculum-inspired reward schedule, prioritizing process-level ATR early and transitioning to outcome rewards, which accelerates convergence on effective reasoning paths. Experiments on seven benchmarks show consistent improvements over the state of the art. Key advantages include: (1) Atom-Searcher scales computation effectively at test time. (2) Atomic Thought provides supervision anchors for RRMs, bridging deep research tasks and RRMs. (3) Atom-Searcher exhibits more interpretable, human-like reasoning patterns.
Community
Atom-Searcher is a novel framework designed to enhance the deep research capabilities of Large Language Models (LLMs). While LLMs show great promise, their static internal knowledge limits their ability to handle complex, multi-step tasks. Existing methods like Retrieval-Augmented Generation (RAG) and outcome-based reinforcement learning (RL) often fall short due to rigid workflows, reward sparsity, and conflicting gradients during training.
To overcome these challenges, we introduce Atom-Searcher, a new reinforcement learning framework built on the concept of Atomic Thought. This paradigm decomposes complex reasoning into fine-grained, functional units. Each "atomic thought" is evaluated by a Reasoning Reward Model (RRM), providing a fine-grained Atomic Thought Reward (ATR) that guides the agent's learning process.
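To make the mechanism concrete, here is a minimal Python sketch of how a process-level ATR could be computed: the reasoning trace is split into tagged atomic thoughts, each unit is scored by an RRM, and the scores are averaged into a single reward. The `<thought role="...">` delimiter format, the `parse_atomic_thoughts` helper, and the `rrm_score` callable are illustrative assumptions, not the paper's actual interfaces.

```python
# Minimal sketch (not the authors' code): computing an Atomic Thought Reward
# (ATR) by scoring each fine-grained reasoning unit with a Reasoning Reward
# Model (RRM). Delimiters and interfaces below are hypothetical.

import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AtomicThought:
    role: str   # e.g. "plan", "verify", "search-intent" (assumed roles)
    text: str


def parse_atomic_thoughts(trace: str) -> List[AtomicThought]:
    """Split a reasoning trace into tagged atomic units.

    Assumes thoughts are delimited as <thought role="...">...</thought>;
    the paper's actual tagging scheme may differ.
    """
    pattern = r'<thought role="([^"]+)">(.*?)</thought>'
    return [AtomicThought(role, text.strip())
            for role, text in re.findall(pattern, trace, flags=re.DOTALL)]


def atomic_thought_reward(trace: str,
                          rrm_score: Callable[[AtomicThought], float]) -> float:
    """Average per-thought RRM scores into one process-level reward.

    `rrm_score(thought) -> float in [0, 1]` is an assumed RRM interface.
    """
    thoughts = parse_atomic_thoughts(trace)
    if not thoughts:
        return 0.0
    return sum(rrm_score(t) for t in thoughts) / len(thoughts)
```

Averaging is one natural aggregation choice; the paper may weight or combine per-thought scores differently.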
The framework uses a curriculum-inspired reward schedule that initially prioritizes high-quality reasoning processes before shifting focus to final outcomes, which accelerates the discovery of effective problem-solving strategies.
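The exact schedule is not specified on this page, but a simple linear annealing between the two reward signals illustrates the idea. In the sketch below, the interpolation form and the `atr_weight_start`/`atr_weight_end` knobs are assumptions for illustration.

```python
# Minimal sketch of a curriculum-inspired reward schedule: early in training
# the process-level ATR dominates, then the weight shifts toward the
# outcome-based reward. The linear form and default weights are assumptions.

def blended_reward(atr: float, outcome: float,
                   step: int, total_steps: int,
                   atr_weight_start: float = 0.8,
                   atr_weight_end: float = 0.1) -> float:
    """Interpolate between process-level (ATR) and outcome-based reward."""
    progress = min(step / max(total_steps, 1), 1.0)
    w = atr_weight_start + (atr_weight_end - atr_weight_start) * progress
    return w * atr + (1.0 - w) * outcome
```

With `total_steps=1000`, for example, the ATR weight decays from 0.8 at step 0 to 0.1 by the end of training, so the policy is first rewarded for sound reasoning steps and later for correct final answers.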
Key advantages of Atom-Searcher include:
State-of-the-Art Performance: Achieves consistent improvements over existing models on seven different benchmarks.
Enhanced Interpretability: Exhibits more human-like and understandable reasoning patterns by breaking down its thought process.
Efficient Training: Mitigates issues of reward sparsity and gradient conflicts, leading to more efficient policy optimization.
Scalable Computation: Effectively scales its computational effort at test time to tackle more complex queries.