microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated 5 minutes ago • 29.6k • 767
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation Paper • 2502.18364 • Published 7 days ago • 31 • 4
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation Paper • 2502.18364 • Published 7 days ago • 31
Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering Paper • 2406.10208 • Published Jun 14, 2024 • 22
Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering Paper • 2406.10208 • Published Jun 14, 2024 • 22 • 2
FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation Paper • 2406.08392 • Published Jun 12, 2024 • 20
Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step Paper • 2406.04314 • Published Jun 6, 2024 • 28
playgroundai/playground-v2.5-1024px-aesthetic Text-to-Image • Updated Mar 15, 2024 • 298k • • 709