HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling Paper • 2506.20452 • Published 17 days ago • 17
JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent Paper • 2506.17612 • Published 21 days ago • 61
ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding Paper • 2506.01853 • Published Jun 2 • 30
Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control Paper • 2506.01943 • Published Jun 2 • 24
🌞 May 2025 - Open works from the Chinese community Collection 43 items • Updated 24 days ago • 9
Article How to Build an MCP Server with Gradio By abidlabs and 1 other • Apr 30 • 178
Emerging Properties in Unified Multimodal Pretraining Paper • 2505.14683 • Published May 20 • 131
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning Paper • 2505.03318 • Published May 6 • 94
🌸 April 2025 - Open releases from the Chinese community Collection 42 items • Updated 24 days ago • 13
Article Tiny Agents: a MCP-powered agent in 50 lines of code By julien-c • Apr 25 • 285
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation Paper • 2504.14899 • Published Apr 21 • 21
SkyReels-V2 Collection Infinite-length Film Generative Model • 17 items • Updated 28 days ago • 46
NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors Paper • 2504.11427 • Published Apr 15 • 19
Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images Paper • 2504.08727 • Published Apr 11 • 11
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11 • 129
SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills Paper • 2504.07079 • Published Apr 9 • 11
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning Paper • 2504.07960 • Published Apr 10 • 49