Submitted by cg1177 49 Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models · 21 authors 4
Submitted by salmannyu 24 X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents · 10 authors 1
Submitted by mpark 23 SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation · 5 authors 1
Submitted by wchengad 21 StyleMe3D: Stylization with Disentangled Priors by Multiple Encoders on 3D Gaussians · 10 authors 1
Submitted by saxon 17 THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models · 4 authors 1
Submitted by Ningyu 15 EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models · 10 authors 1
Submitted by frog123123123123 13 Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs · 10 authors 1
Submitted by Swtheking 13 LeetCodeDataset: A Temporal Dataset for Robust Evaluation and Efficient Training of Code LLMs · 8 authors 1
Submitted by ewrfcas 11 Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation · 8 authors 1
Submitted by pengxiang 11 InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners · 8 authors 1
Submitted by Yuxiang007 8 LearnAct: Few-Shot Mobile GUI Agent with a Unified Demonstration Benchmark · 9 authors 1
Submitted by manuelkansy 7 LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping · 5 authors 5
Submitted by bys0318 6 An LMM for Efficient Video Understanding via Reinforced Compression of Video Cubes · 7 authors 1
Submitted by SieraL 5 NEMOTRON-CROSSTHINK: Scaling Self-Learning beyond Math Reasoning · 11 authors 3
Submitted by quyanh 4 RainbowPlus: Enhancing Adversarial Prompt Generation via Evolutionary Quality-Diversity Search · 3 authors 4
Submitted by RanjanSapkota 3 RF-DETR Object Detection vs YOLOv12: A Study of Transformer-based and CNN-based Architectures for Single-Class and Multi-Class Greenfruit Detection in Complex Orchard Environments Under Label Ambiguity · 4 authors 1
Submitted by ChenWu98 1 Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction · 4 authors 1
Submitted by reyavir 1 PROMPTEVALS: A Dataset of Assertions and Guardrails for Custom Production Large Language Model Pipelines · 5 authors 1
Submitted by nielsr 1 LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models · 5 authors 1
Submitted by tnngo2 1 SilVar-Med: A Speech-Driven Visual Language Model for Explainable Abnormality Detection in Medical Imaging · 6 authors 1