Papers
arxiv:2504.20769

Chain-of-Defensive-Thought: Structured Reasoning Elicits Robustness in Large Language Models against Reference Corruption

Published on Apr 29
Β· Submitted by wangwenxiao on Apr 30
Authors:
,
,

Abstract

Chain-of-thought prompting has demonstrated great success in facilitating the reasoning abilities of large language models. In this work, we explore how these enhanced reasoning abilities can be exploited to improve the robustness of large language models in tasks that are not necessarily reasoning-focused. In particular, we show how a wide range of large language models exhibit significantly improved robustness against reference corruption using a simple method called chain-of-defensive-thought, where only a few exemplars with structured and defensive reasoning are provided as demonstrations. Empirically, the improvements can be astounding, especially given the simplicity and applicability of the method. For example, in the Natural Questions task, the accuracy of GPT-4o degrades from 60% to as low as 3% with standard prompting when 1 out of 10 references provided is corrupted with prompt injection attacks. In contrast, GPT-4o using chain-of-defensive-thought prompting maintains an accuracy of 50%.

Community

πŸ›‘οΈ Using Reasoning LLMs for Reliability

The world is investing heavily in reasoning LLMs β€” but πŸ€” how can it help tasks that aren’t reasoning-intensive?

One angle:
Reasoning abilities (of LLMs) can be exploited for reliability!

We explored this and it’s surprisingly easy & surprisingly effective!
πŸ”— Read the paper


πŸ“š Background

LLMs are naturally limited in up-to-date or specialized knowledge.
That’s why so many β€” including OpenAI and Google β€” augment them with external references (e.g., RAG, search, deep research).

However, when those references are compromised, LLM performance can break down β€” raising serious reliability concerns:


🧠 Introducing Chain-of-Defensive-Thought

We propose a simple, prompting-only method called Chain-of-Defensive-Thought to enhance LLM robustness against corrupted external references.

  • No fine-tuning needed
  • Just a few exemplars with structured, defensive reasoning

Illustration:

illustration_horizontal.png


πŸ“ˆ Key Results

Despite its simplicity, Chain-of-Defensive-Thought significantly improves LLM robustness across a wide range of models!

robustness_overall_for_tweet.png


πŸš€ Why It Matters

  • Simple: Just prompting β€” no architecture changes.
  • Effective: Major reliability improvements.
  • Timely: Perfect for boosting systems based on RAG, search augmentation, and retrieval pipelines.

This could open up exciting new research directions with the rise of reasoning-optimized LLMs (e.g., OpenAI's o-series, DeepSeek R1). Thoughts?

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any Paper on Hugging Face checkout this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend

Your need to confirm your account before you can post a new comment.

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2504.20769 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2504.20769 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2504.20769 in a Space README.md to link it from this page.

Collections including this paper 1