arxiv:2508.05128

Attention Basin: Why Contextual Position Matters in Large Language Models

Published on Aug 7
· Submitted by Zihao1 on Aug 8
Abstract

The performance of Large Language Models (LLMs) is highly sensitive to the contextual position of information in the input. To investigate the mechanism behind this positional bias, we conduct extensive experiments, which reveal a consistent phenomenon we term the attention basin: when presented with a sequence of structured items (e.g., retrieved documents or few-shot examples), models systematically assign higher attention to the items at the beginning and end of the sequence, while neglecting those in the middle. Crucially, our analysis further reveals that allocating higher attention to critical information is key to enhancing model performance. Based on these insights, we introduce Attention-Driven Reranking (AttnRank), a two-stage framework that (i) estimates a model's intrinsic positional attention preferences using a small calibration set, and (ii) reorders retrieved documents or few-shot examples to align the most salient content with these high-attention positions. AttnRank is a model-agnostic, training-free, and plug-and-play method with minimal computational overhead. Experiments on multi-hop QA and few-shot in-context learning tasks demonstrate that AttnRank achieves substantial improvements across 10 large language models of varying architectures and scales, without modifying model parameters or training procedures.
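
For readers who want to see how stage (i) could look in practice, below is a minimal sketch, not the authors' released code: it estimates a per-slot attention profile over the structured items of a few calibration prompts, assuming a Hugging Face causal LM that exposes attention weights via `output_attentions=True`. The function name, the item-span bookkeeping, and the choice to average over layers and heads and read attention from the final query token are illustrative assumptions.

```python
# Minimal sketch of stage (i): calibrate a per-slot attention profile.
# NOT the paper's implementation; span bookkeeping and averaging choices are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def positional_attention_profile(model, tokenizer, calibration_prompts, item_spans_per_prompt):
    """Estimate how much attention each item slot (1st, 2nd, ... position in the
    context) receives, averaged over a small calibration set.

    item_spans_per_prompt[k] is a list of (start_token, end_token) spans, one per
    structured item (e.g., retrieved document) in prompt k, in presentation order.
    """
    num_slots = len(item_spans_per_prompt[0])
    slot_scores = torch.zeros(num_slots)
    for prompt, spans in zip(calibration_prompts, item_spans_per_prompt):
        inputs = tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            out = model(**inputs, output_attentions=True)
        # out.attentions is a tuple of (batch, heads, seq, seq) tensors, one per layer;
        # average over layers and heads to get a single (seq, seq) attention map.
        attn = torch.stack(out.attentions).mean(dim=(0, 2))[0]
        # Attention paid by the final (query) token to each item's tokens.
        last_row = attn[-1]
        for slot, (start, end) in enumerate(spans):
            slot_scores[slot] += last_row[start:end].mean().item()
    return (slot_scores / len(calibration_prompts)).tolist()

# Usage (hypothetical model choice; output_attentions may require the "eager" attention backend):
# tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
# lm = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct",
#                                           attn_implementation="eager")
# profile = positional_attention_profile(lm, tok, prompts, spans)
```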

Community

Paper submitter

This work presents a compelling and meticulously documented exploration of a subtle yet significant challenge in LLM behavior: positional bias. The identification and characterization of the "attention basin" phenomenon – the systematic tendency of models to focus more on the beginning and end of structured input sequences while neglecting the middle – feels like uncovering a fundamental, previously underappreciated quirk in how these models process information. It resonates strongly with practical observations many have likely encountered but perhaps not formally understood. The crucial link established between heightened attention allocation and improved model performance provides a clear motivation for intervention.

The proposed solution, AttnRank, is particularly appealing for its elegance and practicality. Its model-agnostic, training-free, and plug-and-play nature addresses a real need in the community for lightweight, adaptable techniques that don't require retraining or parameter modification. The principle of leveraging a model's own inferred positional preferences, calibrated on a small set, to strategically reorder inputs is clever and intuitive (a rough sketch of this reordering step follows after this comment). The reported improvements across 10 LLMs of diverse architectures and scales are quietly impressive and strongly support the method's robustness and general applicability to both retrieval-augmented tasks and few-shot learning.

This research feels like a valuable step towards demystifying LLM internals related to context utilization. By clearly demonstrating the performance impact of simple positional reordering guided by inherent attention patterns, it offers a concrete, readily implementable tool. While it doesn't claim to solve all context sensitivity issues, it provides a principled, effective approach grounded in solid empirical analysis. The work deserves attention for its insightful diagnosis of the "attention basin" and its delivery of a practical, efficient remedy with demonstrable, widespread benefits. It’s a useful contribution that enhances our understanding.
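
To make the reordering principle described above concrete, here is a small self-contained sketch (an illustration, not the paper's released code) of stage (ii): given per-item salience scores, e.g. from a retriever, and a calibrated per-slot attention profile, the most salient items are placed into the most-attended slots. All names and values in the example are hypothetical.

```python
def attn_rank_reorder(items, salience_scores, slot_attention_profile):
    """Place the most salient item into the most-attended slot, the second most
    salient into the second most-attended slot, and so on."""
    # Items from most to least salient.
    items_by_salience = [item for _, item in
                         sorted(zip(salience_scores, items), key=lambda pair: -pair[0])]
    # Slot indices from most to least attended (per the calibrated profile).
    slots_by_attention = sorted(range(len(slot_attention_profile)),
                                key=lambda j: -slot_attention_profile[j])
    reordered = [None] * len(items)
    for item, slot in zip(items_by_salience, slots_by_attention):
        reordered[slot] = item
    return reordered

# Hypothetical U-shaped ("attention basin") profile over 5 document slots:
profile = [0.30, 0.12, 0.05, 0.08, 0.25]
docs = ["d1", "d2", "d3", "d4", "d5"]
relevance = [0.9, 0.2, 0.7, 0.1, 0.4]   # hypothetical retriever scores
print(attn_rank_reorder(docs, relevance, profile))
# -> ['d1', 'd5', 'd4', 'd2', 'd3']: the two most relevant documents land at the
#    beginning and end of the context, where attention is highest.
```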

