PaperFit: Vision-in-the-Loop Typesetting Optimization for Scientific Documents Paper • 2605.10341 • Published 5 days ago • 31
Mean Mode Screaming: Mean--Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published 9 days ago • 182
Perceptual Flow Network for Visually Grounded Reasoning Paper • 2605.02730 • Published 12 days ago • 6
The Continuity Layer: Why Intelligence Needs an Architecture for What It Carries Forward Paper • 2604.17273 • Published 27 days ago • 3
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published 24 days ago • 240
Benchmarking and Mechanistic Analysis of Vision-Language Models for Cross-Depiction Assembly Instruction Alignment Paper • 2604.00913 • Published Apr 1 • 4
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published Mar 30 • 342
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 350
inference-optimization/gpt-oss-120b-from-qwen235b-then-self-ckpt4-speculator.eagle3 0.9B • Updated Apr 1 • 2 • 1
Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models Paper • 2603.17051 • Published Mar 17 • 109
SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models Paper • 2603.16859 • Published Mar 17 • 248