Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2412.10316

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6 • 25
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6 • 12
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7 • 39
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7 • 20

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

about 11 hours ago

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1 • 21
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 82
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 145
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30 • 25

TencentARC/BrushEdit

Image-to-Image • Updated 10 days ago • 21
Running on Zero

57

🤠

BrushEdit
BrushEdit: All-In-One Image Inpainting and Editing

Paper • 2412.10316 • Published 13 days ago • 33

BrushEdit: All-In-One Image Inpainting and Editing

Paper • 2412.10316 • Published 13 days ago • 33
ColorFlow: Retrieval-Augmented Image Sequence Colorization

Paper • 2412.11815 • Published 10 days ago • 26
FluxSpace: Disentangled Semantic Editing in Rectified Flow Transformers

Paper • 2412.09611 • Published 13 days ago • 9
FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing

Paper • 2412.07517 • Published 16 days ago • 11

GenEx: Generating an Explorable World

Paper • 2412.09624 • Published 13 days ago • 84
Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation

Paper • 2412.09428 • Published 14 days ago • 7
BrushEdit: All-In-One Image Inpainting and Editing

Paper • 2412.10316 • Published 13 days ago • 33
FashionComposer: Compositional Fashion Image Generation

Paper • 2412.14168 • Published 7 days ago • 16

Inpainting-based image insturction editing

TencentARC/BrushEdit

Image-to-Image • Updated 10 days ago • 21
BrushEdit: All-In-One Image Inpainting and Editing

Paper • 2412.10316 • Published 13 days ago • 33
Running on Zero

57

🤠

BrushEdit

Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

Paper • 2411.07126 • Published Nov 11 • 28
Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient

Paper • 2411.17787 • Published 30 days ago • 11
TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation

Paper • 2412.03069 • Published 22 days ago • 30
BrushEdit: All-In-One Image Inpainting and Editing

Paper • 2412.10316 • Published 13 days ago • 33

Gen AI Diffusion

Animate-X: Universal Character Image Animation with Enhanced Motion Representation

Paper • 2410.10306 • Published Oct 14 • 54
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Paper • 2411.05003 • Published Nov 7 • 70
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation

Paper • 2411.04709 • Published Nov 5 • 25
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation

Paper • 2410.07171 • Published Oct 9 • 41

MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training

Paper • 2311.17049 • Published Nov 28, 2023 • 1
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Paper • 2405.04434 • Published May 7 • 14
A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision

Paper • 2303.17376 • Published Mar 30, 2023
Sigmoid Loss for Language Image Pre-Training

Paper • 2303.15343 • Published Mar 27, 2023 • 5

Applied Machine Learning Papers

Reading List (Mainly Focused of VLM's and Diffusion Models)

Scalable Diffusion Models with Transformers

Paper • 2212.09748 • Published Dec 19, 2022 • 17
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets

Paper • 2311.15127 • Published Nov 25, 2023 • 12
Learning Transferable Visual Models From Natural Language Supervision

Paper • 2103.00020 • Published Feb 26, 2021 • 11
U-Net: Convolutional Networks for Biomedical Image Segmentation

Paper • 1505.04597 • Published May 18, 2015 • 8

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs