CalliReader: Contextualizing Chinese Calligraphy via an Embedding-Aligned Vision-Language Model Paper • 2503.06472 • Published Mar 9, 2025 • 8
MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning Paper • 2506.10963 • Published Jun 12, 2025 • 9
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation Paper • 2502.18364 • Published Feb 25, 2025 • 36
Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step Paper • 2406.04314 • Published Jun 6, 2024 • 30