NeuralOS: Towards Simulating Operating Systems via Neural Generative Models • Paper • 2507.08800 • Published 5 days ago • 59
GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks • Paper • 2504.12764 • Published Apr 17 • 42
REARANK: Reasoning Re-ranking Agent via Reinforcement Learning • Paper • 2505.20046 • Published May 26 • 18
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers • Paper • 2505.21497 • Published May 27 • 105
Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval • Paper • 2505.16967 • Published May 22 • 23
Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning • Paper • 2505.15966 • Published May 21 • 53
QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design • Paper • 2505.16175 • Published May 22 • 42
General-Reasoner • Collection • Advancing LLMs' general reasoning capabilities • 9 items • Updated 21 days ago • 4
VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation • Paper • 2505.14640 • Published May 20 • 15
General-Reasoner: Advancing LLM Reasoning Across All Domains • Paper • 2505.14652 • Published May 20 • 23
Tevatron 2.0: Unified Document Retrieval Toolkit across Scale, Language, and Modality • Paper • 2505.02466 • Published May 5 • 1
Document Screenshot Retrievers are Vulnerable to Pixel Poisoning Attacks • Paper • 2501.16902 • Published Jan 28 • 1
DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers • Paper • 2502.18460 • Published Feb 25 • 3
ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations • Paper • 2504.00824 • Published Apr 1 • 44
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning • Paper • 2504.08837 • Published Apr 10 • 43
DRAMA • Collection • A collection of small (sub-1B) multilingual dense retrievers that generalize well across a number of tasks and languages. • 3 items • Updated Feb 26 • 7