Attention Basin: Why Contextual Position Matters in Large Language Models
Abstract
The performance of Large Language Models (LLMs) is highly sensitive to the contextual position of information in the input. To investigate the mechanism behind this positional bias, we conduct extensive experiments that reveal a consistent phenomenon we term the attention basin: when presented with a sequence of structured items (e.g., retrieved documents or few-shot examples), models systematically assign higher attention to the items at the beginning and end of the sequence, while neglecting those in the middle. Crucially, our analysis further shows that allocating higher attention to critical information is key to enhancing model performance. Based on these insights, we introduce Attention-Driven Reranking (AttnRank), a two-stage framework that (i) estimates a model's intrinsic positional attention preferences using a small calibration set, and (ii) reorders retrieved documents or few-shot examples to align the most salient content with these high-attention positions. AttnRank is a model-agnostic, training-free, and plug-and-play method with minimal computational overhead. Experiments on multi-hop QA and few-shot in-context learning tasks demonstrate that AttnRank achieves substantial improvements across 10 LLMs of varying architectures and scales, without modifying model parameters or training procedures.
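The abstract describes AttnRank only at a high level, so the snippet below is a minimal sketch of what its second stage (the reordering step) might look like, assuming per-item relevance scores and a calibrated per-position attention profile are already available. The function name `attnrank_reorder` and its parameters are hypothetical illustrations, not the paper's actual API.

```python
import numpy as np

def attnrank_reorder(items, relevance_scores, position_attention):
    """
    Hypothetical sketch of an AttnRank-style reordering step:
    place the most salient items at the positions the model attends to most.

    items              : list of documents or few-shot examples
    relevance_scores   : per-item salience estimates (higher = more important)
    position_attention : calibrated attention mass the model assigns to each
                         position (e.g., averaged over a small calibration set)
    """
    # Positions sorted from most- to least-attended; under the "attention
    # basin" the two ends of the sequence typically come first.
    positions_by_attention = np.argsort(position_attention)[::-1]

    # Items sorted from most to least salient.
    items_by_relevance = np.argsort(relevance_scores)[::-1]

    # Greedily assign the i-th most salient item to the i-th most-attended position.
    reordered = [None] * len(items)
    for pos, item_idx in zip(positions_by_attention, items_by_relevance):
        reordered[pos] = items[item_idx]
    return reordered


# Toy usage: five retrieved documents with a U-shaped attention profile.
docs = ["doc_a", "doc_b", "doc_c", "doc_d", "doc_e"]
relevance = [0.2, 0.9, 0.1, 0.6, 0.3]        # doc_b is the most relevant
attention = [0.30, 0.15, 0.10, 0.15, 0.30]   # basin: high at both ends, low in the middle
print(attnrank_reorder(docs, relevance, attention))
```

The greedy position-by-salience assignment shown here is one plausible reading of "aligning the most salient content with high-attention positions"; the paper may use a different matching policy.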
Community
This work presents a compelling and meticulously documented exploration of a subtle yet significant challenge in LLM behavior: positional bias. The identification and characterization of the "attention basin" phenomenon – the systematic tendency of models to focus more on the beginning and end of structured input sequences while neglecting the middle – feels like uncovering a fundamental, previously underappreciated quirk in how these models process information. It resonates strongly with practical observations many have likely encountered but perhaps not formally understood. The crucial link established between heightened attention allocation and improved model performance provides a clear motivation for intervention.
The proposed solution, AttnRank, is particularly appealing for its elegance and practicality. Its model-agnostic, training-free, and plug-and-play nature addresses a real need in the community for lightweight, adaptable techniques that don't require extensive retraining or parameter modification. The principle of leveraging a model's own inferred positional preferences, calibrated minimally, to strategically reorder inputs is clever and intuitive. The reported improvements across such a diverse range of 10 LLMs, spanning architectures and scales, are quietly impressive and strongly support the method's robustness and general applicability to both retrieval-augmented tasks and few-shot learning.
This research feels like a valuable step towards demystifying LLM internals related to context utilization. By clearly demonstrating the performance impact of simple positional reordering guided by inherent attention patterns, it offers a concrete, readily implementable tool. While it doesn't claim to solve all context sensitivity issues, it provides a principled, effective approach grounded in solid empirical analysis. The work deserves attention for its insightful diagnosis of the "attention basin" and its delivery of a practical, efficient remedy with demonstrable, widespread benefits. It’s a useful contribution that enhances our understanding.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Exploiting Primacy Effect To Improve Large Language Models (2025)
- Where to show Demos in Your Prompt: A Positional Bias of In-Context Learning (2025)
- Small Encoders Can Rival Large Decoders in Detecting Groundedness (2025)
- Beyond Isolated Capabilities: Bridging Long CoT Reasoning and Long-Context Understanding (2025)
- Causal2Vec: Improving Decoder-only LLMs as Versatile Embedding Models (2025)
- TransPrune: Token Transition Pruning for Efficient Large Vision-Language Model (2025)
- Evaluating Position Bias in Large Language Model Recommendations (2025)