A Comprehensive Survey on Continual Learning in Generative Models Paper • 2506.13045 • Published 10 days ago
Aligned Better, Listen Better for Audio-Visual Large Language Models Paper • 2504.02061 • Published Apr 2
AnimeShooter: A Multi-Shot Animation Dataset for Reference-Guided Video Generation Paper • 2506.03126 • Published 23 days ago • 22
Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning? Paper • 2505.21374 • Published 30 days ago • 26
ProtoGCD: Unified and Unbiased Prototype Learning for Generalized Category Discovery Paper • 2504.03755 • Published Apr 2 • 1
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 Paper • 2503.24376 • Published Mar 31 • 38
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction Paper • 2504.01014 • Published Apr 1 • 70
MSPE: Multi-Scale Patch Embedding Prompts Vision Transformers to Any Resolution Paper • 2405.18240 • Published May 28, 2024
Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization Paper • 2403.03145 • Published Mar 5, 2024
Cross Pseudo-Labeling for Semi-Supervised Audio-Visual Source Localization Paper • 2403.03095 • Published Mar 5, 2024
GenHancer: Imperfect Generative Models are Secretly Strong Vision-Centric Enhancers Paper • 2503.19480 • Published Mar 25 • 16
WPS-SAM: Towards Weakly-Supervised Part Segmentation with Foundation Models Paper • 2407.10131 • Published Jul 14, 2024