Collections

Collections including paper arxiv:2503.23461

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 27
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 13
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 43
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 22

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

about 8 hours ago

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 23
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 83
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 147
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 25

Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models

Paper • 2503.18446 • Published 11 days ago • 9
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models

Paper • 2503.20240 • Published 9 days ago • 21
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation

Paper • 2503.20672 • Published 8 days ago • 13
Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models

Paper • 2503.20198 • Published 9 days ago • 4

K-LoRA: Unlocking Training-Free Fusion of Any Subject and Style LoRAs

Paper • 2502.18461 • Published Feb 25 • 15
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations

Paper • 2410.10792 • Published Oct 14, 2024 • 30
Rewards Are Enough for Fast Photo-Realistic Text-to-image Generation

Paper • 2503.13070 • Published 18 days ago • 9
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models

Paper • 2503.12885 • Published 18 days ago • 42

research papers

SIFT: Grounding LLM Reasoning in Contexts via Stickers

Paper • 2502.14922 • Published Feb 19 • 30
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes

Paper • 2503.23461 • Published 5 days ago • 72

checkitoutlater

MotionShop: Zero-Shot Motion Transfer in Video Diffusion Models with Mixture of Score Guidance

Paper • 2412.05355 • Published Dec 6, 2024 • 9
SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step Diffusion

Paper • 2412.04301 • Published Dec 5, 2024 • 38
PanoDreamer: 3D Panorama Synthesis from a Single Image

Paper • 2412.04827 • Published Dec 6, 2024 • 11
Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation

Paper • 2412.06781 • Published Dec 9, 2024 • 21

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Paper • 2401.09048 • Published Jan 17, 2024 • 10
Improving fine-grained understanding in image-text pre-training

Paper • 2401.09865 • Published Jan 18, 2024 • 17
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19, 2024 • 60
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Paper • 2401.13627 • Published Jan 24, 2024 • 74
