Submitted by Nicolas-BZRD 63 Should We Still Pretrain Encoders with Masked Language Modeling? · 8 authors 60 7
Submitted by JunhaoZhuang 33 4DSloMo: 4D Reconstruction for High Speed Scene with Asynchronous Capture · 7 authors 2
Submitted by RunpeiDong 30 DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge · 13 authors 31 1
Submitted by RowitZou 28 Pre-Trained Policy Discriminators are General Reward Models · 22 authors 46 1
Submitted by RTT1 23 Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving · 18 authors 3
Submitted by KYLN24 22 BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset · 15 authors 6 1
Submitted by hiyouga 15 Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents · 7 authors 9.23k 1
Submitted by Bibaolong 15 RefineX: Learning to Refine Pre-training Data at Scale from Expert-Guided Programs · 10 authors 1
Submitted by justinyyy 12 OmniDraft: A Cross-vocabulary, Online Adaptive Drafter for On-device Speculative Decoding · 7 authors 1
Submitted by ZZXF 11 Reviving Cultural Heritage: A Novel Approach for Comprehensive Historical Document Restoration · 8 authors 26 1
Submitted by ai-hyz 10 Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions · 3 authors 6 2
Submitted by xxzcc 8 ArtifactsBench: Bridging the Visual-Interactive Gap in LLM Code Generation Evaluation · 32 authors 1
Submitted by ziyjiang 6 VLM2Vec-V2: Advancing Multimodal Embedding for Videos, Images, and Visual Documents · 13 authors 294 1
Submitted by SteveZeyuZhang 6 PresentAgent: Multimodal Agent for Presentation Video Generation · 7 authors 16 1
Submitted by cedricbonhomme 5 VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification · 2 authors 12 1
Submitted by danielchyeh 4 Beyond Simple Edits: X-Planner for Complex Instruction-Based Image Editing · 7 authors 1
Submitted by Johnyquest7 4 Preserving Privacy, Increasing Accessibility, and Reducing Cost: An On-Device Artificial Intelligence Model for Medical Transcription and Note Generation · 6 authors 1
Submitted by ashutosh1919 3 Disambiguation-Centric Finetuning Makes Enterprise Tool-Calling LLMs More Realistic and Less Risky · 3 authors 1
Submitted by jannalu 2 Evaluating LLMs on Real-World Forecasting Against Human Superforecasters · 1 author 2
Submitted by amanchadha 2 MOD-X: A Modular Open Decentralized eXchange Framework proposal for Heterogeneous Interoperable Artificial Agents · 5 authors 1