-
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Paper • 2501.00958 • Published • 92 -
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
Paper • 2412.19723 • Published • 78 -
PERSE: Personalized 3D Generative Avatars from A Single Portrait
Paper • 2412.21206 • Published • 15 -
Training Software Engineering Agents and Verifiers with SWE-Gym
Paper • 2412.21139 • Published • 20
Collections
Discover the best community collections!
Collections including paper arxiv:2412.21206
-
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
Paper • 2412.19723 • Published • 78 -
PERSE: Personalized 3D Generative Avatars from A Single Portrait
Paper • 2412.21206 • Published • 15 -
Efficiently Serving LLM Reasoning Programs with Certaindex
Paper • 2412.20993 • Published • 33
-
Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance
Paper • 2401.15687 • Published • 23 -
Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians
Paper • 2312.03029 • Published • 23 -
DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation
Paper • 2312.13578 • Published • 27 -
Splatter Image: Ultra-Fast Single-View 3D Reconstruction
Paper • 2312.13150 • Published • 14
-
DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation
Paper • 2312.13578 • Published • 27 -
Splatter Image: Ultra-Fast Single-View 3D Reconstruction
Paper • 2312.13150 • Published • 14 -
Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians
Paper • 2312.03029 • Published • 23 -
Relightable Gaussian Codec Avatars
Paper • 2312.03704 • Published • 29