Training-Free Motion-Guided Video Generation with Enhanced Temporal Consistency Using Motion Consistency Loss Paper • 2501.07563 • Published Jan 13, 2025 • 1
VQ-VA World: Towards High-Quality Visual Question-Visual Answering Paper • 2511.20573 • Published Nov 25, 2025 • 7
Emerging Properties in Unified Multimodal Pretraining Paper • 2505.14683 • Published May 20, 2025 • 133
MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization Paper • 2408.02555 • Published Aug 5, 2024 • 31
Rejuvenating image-GPT as Strong Visual Representation Learners Paper • 2312.02147 • Published Dec 4, 2023 • 7
What If We Recaption Billions of Web Images with LLaMA-3? Paper • 2406.08478 • Published Jun 12, 2024 • 41
Vidu4D: Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels Paper • 2405.16822 • Published May 27, 2024 • 12
V3D: Video Diffusion Models are Effective 3D Generators Paper • 2403.06738 • Published Mar 11, 2024 • 30
CLIPA-v2: Scaling CLIP Training with 81.1% Zero-shot ImageNet Accuracy within a \$10,000 Budget; An Extra \$4,000 Unlocks 81.8% Accuracy Paper • 2306.15658 • Published Jun 27, 2023 • 12