Submitted by xuchensong 46 Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning · 13 authors 2
Submitted by hongyuw 35 BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs · 3 authors 2
Submitted by YunxinLi 21 VideoVista-CulturalLingo: 360^circ Horizons-Bridging Cultures, Languages, and Domains in Video Comprehension · 7 authors 2
Submitted by HanleiZhang 14 Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark · 8 authors 2
Submitted by alemiaschi 12 Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation · 9 authors 1
Submitted by pnawrot 11 The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs · 6 authors 3
Submitted by carpedkm 11 Subject-driven Video Generation via Disentangled Identity and Motion · 7 authors 2
Submitted by amazingj 7 DianJin-R1: Evaluating and Enhancing Financial Reasoning in Large Language Models · 7 authors 2
Submitted by zaplm 7 DC-SAM: In-Context Segment Anything in Images and Videos via Dual Consistency · 7 authors 2
Submitted by Pclanglais 6 Even Small Reasoners Should Quote Their Sources: Introducing the Pleias-RAG Model Family · 9 authors 2