arxiv:2509.04011

NER Retriever: Zero-Shot Named Entity Retrieval with Type-Aware Embeddings

Published on Sep 4

· Submitted by

Authors:

Abstract

NER Retriever uses internal representations from large language models to perform zero-shot named entity retrieval by embedding entity mentions and type descriptions into a shared semantic space, outperforming lexical and dense sentence-level retrieval methods.

AI-generated summary

We present NER Retriever, a zero-shot retrieval framework for ad-hoc Named Entity Retrieval, a variant of Named Entity Recognition (NER), where the types of interest are not provided in advance, and a user-defined type description is used to retrieve documents mentioning entities of that type. Instead of relying on fixed schemas or fine-tuned models, our method builds on internal representations of large language models (LLMs) to embed both entity mentions and user-provided open-ended type descriptions into a shared semantic space. We show that internal representations, specifically the value vectors from mid-layer transformer blocks, encode fine-grained type information more effectively than commonly used top-layer embeddings. To refine these representations, we train a lightweight contrastive projection network that aligns type-compatible entities while separating unrelated types. The resulting entity embeddings are compact, type-aware, and well-suited for nearest-neighbor search. Evaluated on three benchmarks, NER Retriever significantly outperforms both lexical and dense sentence-level retrieval baselines. Our findings provide empirical support for representation selection within LLMs and demonstrate a practical solution for scalable, schema-free entity retrieval. The NER Retriever Codebase is publicly available at https://github.com/ShacharOr100/ner_retriever

View arXiv page View PDF GitHub 10 Add to collection

Community

Uri-ka

Paper submitter 2 days ago

NER Retriever, a zero-shot retrieval framework for ad-hoc Named Entity Retrieval with Type-Aware Embeddings and a lightweight contrastive projection network