Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning Paper • 2507.06485 • Published Jul 9 • 4
MEXA: Towards General Multimodal Reasoning with Dynamic Multi-Expert Aggregation Paper • 2506.17113 • Published Jun 20 • 4
Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models Paper • 2506.07177 • Published Jun 8 • 22
Bitwidth Heterogeneous Federated Learning with Progressive Weight Dequantization Paper • 2202.11453 • Published Feb 23, 2022
Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization Paper • 2504.08641 • Published Apr 11 • 7
Video-Skill-CoT: Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning Paper • 2506.03525 • Published Jun 4 • 6
EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance Paper • 2505.21876 • Published May 28 • 9
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published May 23 • 81
RSQ: Learning from Important Tokens Leads to Better Quantized LLMs Paper • 2503.01820 • Published Mar 3 • 2 • 3
UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning Paper • 2502.15082 • Published Feb 20 • 1
Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model Paper • 2502.13449 • Published Feb 19 • 46
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20 • 146