Long-Context Autoregressive Video Modeling with Next-Frame Prediction • Paper • 2503.19325 • Published Mar 25 • 73
CoMP: Continual Multimodal Pre-training for Vision Foundation Models • Paper • 2503.18931 • Published Mar 24 • 30