InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity Paper • 2503.16418 • Published Mar 20 • 35
Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams Paper • 2406.08085 • Published Jun 12, 2024 • 17
Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model Paper • 2408.00754 • Published Aug 1, 2024 • 25
MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model Paper • 2404.19759 • Published Apr 30, 2024 • 28
KV-Edit: Training-Free Image Editing for Precise Background Preservation Paper • 2502.17363 • Published Feb 24 • 36
VoCo-LLaMA: Towards Vision Compression with Large Language Models Paper • 2406.12275 • Published Jun 18, 2024 • 32
LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer Paper • 2502.01105 • Published Feb 3 • 20