Collapse of Dense Retrievers: Short, Early, and Literal Biases Outranking Factual Evidence • Paper • 2503.05037 • Published Mar 6, 2025
M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis • Paper • 2502.11824 • Published Feb 17, 2025
Understanding In-Context Machine Translation for Low-Resource Languages: A Case Study on Manchu • Paper • 2502.11862 • Published Feb 17, 2025
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages • Paper • 2410.23825 • Published Oct 31, 2024
LangSAMP: Language-Script Aware Multilingual Pretraining • Paper • 2409.18199 • Published Sep 26, 2024
MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment • Paper • 2410.05873 • Published Oct 8, 2024
MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory • Paper • 2404.11672 • Published Apr 17, 2024
Consistent Document-Level Relation Extraction via Counterfactuals • Paper • 2407.06699 • Published Jul 9, 2024