Friends and Grandmothers in Silico: Localizing Entity Cells in Language Models
Abstract
Entity-centric factual question answering relies on localized MLP neurons: causal interventions on these neurons recover entity-consistent predictions and are robust to linguistic variation, though not every entity admits such a handle.
Language models can answer many entity-centric factual questions, but it remains unclear which internal mechanisms are involved in this process. We study this question across multiple language models. We localize entity-selective MLP neurons using templated prompts about each entity, and then validate them with causal interventions on PopQA-based QA examples. On a curated set of 200 entities drawn from PopQA, localized neurons concentrate in early layers. Negative ablation produces entity-specific amnesia, while controlled injection at a placeholder token improves answer retrieval relative to mean-entity and wrong-cell controls. For many entities, activating a single localized neuron is sufficient to recover entity-consistent predictions once the context is initialized, consistent with compact entity retrieval rather than purely gradual enrichment across depth. Robustness to aliases, acronyms, misspellings, and multilingual forms supports a canonicalization interpretation. The effect is strong but not universal: not every entity admits a reliable single-neuron handle, and coverage is higher for popular entities. Overall, these results identify sparse, causally actionable access points for analyzing and modulating entity-conditioned factual behavior.
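The abstract describes two causal interventions on localized neurons: negative ablation (silencing a neuron to induce entity-specific amnesia) and controlled injection (writing an activation into a placeholder token position). A minimal sketch of these two operations on toy MLP activations, with all names, shapes, and values illustrative rather than taken from the paper's code:

```python
# Hedged sketch of the two interventions named in the abstract.
# `acts`, neuron index 3, and the injected value 4.2 are all
# hypothetical; a real run would operate on a model's MLP
# activations via forward hooks.
import numpy as np

rng = np.random.default_rng(0)

# Toy MLP hidden activations: (seq_len, n_neurons)
acts = rng.normal(size=(5, 8))

def ablate(acts, neuron):
    """Negative ablation: silence one localized neuron at every position."""
    out = acts.copy()
    out[:, neuron] = 0.0
    return out

def inject(acts, neuron, value, position):
    """Controlled injection: overwrite a neuron's activation at one
    token position (e.g. a placeholder standing in for the entity)."""
    out = acts.copy()
    out[position, neuron] = value
    return out

ablated = ablate(acts, neuron=3)
injected = inject(acts, neuron=3, value=4.2, position=0)
print(float(np.abs(ablated[:, 3]).sum()))  # 0.0 after ablation
print(float(injected[0, 3]))               # 4.2 at the placeholder token
```

In an actual model these edits would be applied inside a forward pass (for instance with a framework's activation hooks), and the "wrong-cell" control from the abstract corresponds to injecting at a mismatched neuron index.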