CoLLM: A Large Language Model for Composed Image Retrieval Paper • 2503.19910 • Published Mar 25 • 14
LOCATEdit: Graph Laplacian Optimized Cross Attention for Localized Text-Guided Image Editing Paper • 2503.21541 • Published Mar 27 • 1
HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration Paper • 2504.03536 • Published Apr 4 • 13
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis Paper • 2504.04842 • Published Apr 7 • 36
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait Paper • 2503.12963 • Published Mar 17 • 7
RASA: Replace Anyone, Say Anything -- A Training-Free Framework for Audio-Driven and Universal Portrait Video Editing Paper • 2503.11571 • Published Mar 14
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning Paper • 2504.07960 • Published 29 days ago • 48
SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing Paper • 2505.02370 • Published 5 days ago • 12
In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer Paper • 2504.20690 • Published 10 days ago • 18