Submitted by chongjie 72 Light of Normals: Unified Feature Representation for Universal Photometric Stereo · 14 authors 106 2
Submitted by mozhu 40 LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning · 5 authors 2
Submitted by michaal94 26 ViDAR: Video Diffusion-Aware 4D Reconstruction From Monocular Inputs · 6 authors 1
Submitted by Lingaaaaaaa 25 ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs · 7 authors 414 1
Submitted by ZhuoweiChen 25 Phantom-Data : Towards a General Subject-Consistent Video Generation Dataset · 11 authors 28 2
Submitted by Yirany 25 RLPR: Extrapolating RLVR to General Domains without Verifiers · 12 authors 30 3
Submitted by csuhan 22 Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations · 9 authors 41 1
Submitted by liguang0115 13 VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory · 4 authors 47 1
Submitted by sgidaris 12 DIP: Unsupervised Dense In-Context Post-training of Visual Representations · 5 authors 1
Submitted by vyokky 8 LettinGo: Explore User Profile Generation for Recommendation System · 12 authors 1
Submitted by ashmrz 8 4Real-Video-V2: Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation · 12 authors 1
Submitted by cliang1453 7 SlimMoE: Structured Compression of Large MoE Models via Expert Slimming and Distillation · 7 authors 1
Submitted by manglu3935 7 Enhancing Step-by-Step and Verifiable Medical Reasoning in MLLMs · 9 authors 5 3
Submitted by natnitaract 7 FinCoT: Grounding Chain-of-Thought in Expert Financial Reasoning · 6 authors 9 2
Submitted by LogicTrainer 6 TC-Light: Temporally Consistent Relighting for Dynamic Long Videos · 9 authors 18 1
Submitted by kittttttt 6 ReDit: Reward Dithering for Improved LLM Policy Optimization · 6 authors 1 1
Submitted by kamahori 6 ConsumerBench: Benchmarking Generative AI Applications on End-User Devices · 6 authors 3 1
Submitted by vanshs1 5 Steering Conceptual Bias via Transformer Latent-Subspace Activation · 2 authors 1
Submitted by seonglae 5 FaithfulSAE: Towards Capturing Faithful Features with Sparse Autoencoders without External Dataset Dependencies · 6 authors 2 1
Submitted by Neo111x 4 I Know Which LLM Wrote Your Code Last Summer: LLM generated Code Stylometry for Authorship Attribution · 9 authors 1 1
Submitted by shuoxing 4 Demystifying the Visual Quality Paradox in Multimodal Large Language Models · 8 authors 2
Submitted by Shoubin 3 4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time · 13 authors 1
Submitted by BoKelvin 3 GEMeX-ThinkVG: Towards Thinking with Visual Grounding in Medical VQA via Reinforcement Learning · 6 authors 1
Submitted by akanatas 3 CultureMERT: Continual Pre-Training for Cross-Cultural Music Representation Learning · 3 authors 1
Submitted by xunguangwang 3 SoK: Evaluating Jailbreak Guardrails for Large Language Models · 6 authors 2
Submitted by tahirakazimi77 2 Audit & Repair: An Agentic Framework for Consistent Story Visualization in Text-to-Image Diffusion Models · 3 authors 1
Submitted by Yeongtak 2 RePIC: Reinforced Post-Training for Personalizing Multi-Modal Language Models · 7 authors 1
Submitted by rajandasgupta 2 A deep learning and machine learning approach to predict neonatal death in the context of São Paulo · 9 authors 2
Submitted by kevin1020 2 Spec2RTL-Agent: Automated Hardware Code Generation from Complex Specifications Using LLM Agent Systems · 6 authors 2
Submitted by xwjzds 1 Quantifying Fairness in LLMs Beyond Tokens: A Semantic and Statistical Perspective · 7 authors 1