Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild · 9 authors · submitted by akhaliq · 75 upvotes · 15 comments
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models · 8 authors · submitted by akhaliq · 32 upvotes · 4 comments
BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models · 4 authors · submitted by akhaliq · 14 upvotes · 1 comment
SpacTor-T5: Pre-training T5 Models with Span Corruption and Replaced Token Detection · 9 authors · submitted by akhaliq · 13 upvotes · 2 comments
UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion · 4 authors · submitted by akhaliq · 12 upvotes · 3 comments
ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models · 4 authors · submitted by akhaliq · 12 upvotes · 1 comment
CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion · 8 authors · submitted by akhaliq · 11 upvotes · 1 comment