Scaling Language-Free Visual Representation Learning Paper • 2504.01017 • Published 1 day ago • 12
Linguini: A benchmark for language-agnostic linguistic reasoning Paper • 2409.12126 • Published Sep 18, 2024
LCFO: Long Context and Long Form Output Dataset and Benchmarking Paper • 2412.08268 • Published Dec 11, 2024
Large Concept Models: Language Modeling in a Sentence Representation Space Paper • 2412.08821 • Published Dec 11, 2024 • 14
BOUQuET: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation Paper • 2502.04314 • Published Feb 6, 2025
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published Feb 25, 2025 • 71
CLUTRR: A Diagnostic Benchmark for Inductive Reasoning from Text Paper • 1908.06177 • Published Aug 16, 2019
Learning an Unreferenced Metric for Online Dialogue Evaluation Paper • 2005.00583 • Published May 1, 2020
How sensitive are translation systems to extra contexts? Mitigating gender bias in Neural Machine Translation models through relevant contexts Paper • 2205.10762 • Published May 22, 2022
MetaMorph: Multimodal Understanding and Generation via Instruction Tuning Paper • 2412.14164 • Published Dec 18, 2024 • 4