Training language models to follow instructions with human feedback • Paper • arXiv:2203.02155 • Published Mar 4, 2022 • 22 upvotes
Seedance 1.0: Exploring the Boundaries of Video Generation Models • Paper • arXiv:2506.09113 • Published Jun 10, 2025 • 102 upvotes
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data • Paper • arXiv:2505.18445 • Published May 24, 2025 • 65 upvotes
KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models • Paper • arXiv:2505.16707 • Published May 22, 2025 • 46 upvotes
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification • Paper • arXiv:2505.16938 • Published May 22, 2025 • 121 upvotes
Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation • Paper • arXiv:2503.18429 • Published Mar 24, 2025 • 3 upvotes
Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets • Paper • arXiv:2505.07747 • Published May 12, 2025 • 61 upvotes
Step1X-Edit: A Practical Framework for General Image Editing • Paper • arXiv:2504.17761 • Published Apr 24, 2025 • 93 upvotes
DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning • Paper • arXiv:2504.07128 • Published Apr 2, 2025 • 87 upvotes
Kimi-VL-A3B Collection • Moonshot's efficient MoE VLMs, exceptional on agentic, long-context, and thinking tasks • 7 items • Updated Jul 1 • 75 upvotes
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation • Paper • arXiv:2504.02542 • Published Apr 3, 2025 • 50 upvotes
Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models • Paper • arXiv:2405.04233 • Published May 7, 2024 • 2 upvotes