big_vision

company

https://github.com/google-research/big_vision

authored 2 papers over 1 year ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20, 2025 • 165

PaliGemma 2: A Family of Versatile VLMs for Transfer

Paper • 2412.03555 • Published Dec 4, 2024 • 136

updated a Space almost 2 years ago

PaliGemma Demo

Annotate and describe images with text prompts

authored a paper almost 2 years ago

PaliGemma: A versatile 3B VLM for transfer

Paper • 2407.07726 • Published Jul 10, 2024 • 73

authored 2 papers over 2 years ago

GIVT: Generative Infinite-Vocabulary Transformers

Paper • 2312.02116 • Published Dec 4, 2023 • 12

Finite Scalar Quantization: VQ-VAE Made Simple

Paper • 2309.15505 • Published Sep 27, 2023 • 24

authored a paper almost 3 years ago

Image Captioners Are Scalable Vision Learners Too

Paper • 2306.07915 • Published Jun 13, 2023 • 12