Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Ianna Chan's picture

3

Ianna Chan

moegi161

moegi161

AI & ML interests

Image synthesis, vision and language

Organizations

None yet

Collections 2

Vision and language

No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance

Paper • 2404.04125 • Published Apr 4, 2024 • 30
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

Paper • 2404.03653 • Published Apr 4, 2024 • 37
Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models

Paper • 2404.02747 • Published Apr 3, 2024 • 13
3D Congealing: 3D-Aware Image Alignment in the Wild

Paper • 2404.02125 • Published Apr 2, 2024 • 10

SpatialTracker: Tracking Any 2D Pixels in 3D Space

Paper • 2404.04319 • Published Apr 5, 2024 • 26
Probing the 3D Awareness of Visual Foundation Models

Paper • 2404.08636 • Published Apr 12, 2024 • 14

Vision and language

No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance

Paper • 2404.04125 • Published Apr 4, 2024 • 30
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

Paper • 2404.03653 • Published Apr 4, 2024 • 37
Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models

Paper • 2404.02747 • Published Apr 3, 2024 • 13
3D Congealing: 3D-Aware Image Alignment in the Wild

Paper • 2404.02125 • Published Apr 2, 2024 • 10

SpatialTracker: Tracking Any 2D Pixels in 3D Space

Paper • 2404.04319 • Published Apr 5, 2024 • 26
Probing the 3D Awareness of Visual Foundation Models

Paper • 2404.08636 • Published Apr 12, 2024 • 14

models 0

None public yet

datasets 0

None public yet

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs