Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2408.08172

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6 • 25
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6 • 12
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7 • 38
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7 • 19

Towards flexible perception with visual memory

Paper • 2408.08172 • Published Aug 15 • 19

Complexity of Symbolic Representation in Working Memory of Transformer Correlates with the Complexity of a Task

Paper • 2406.14213 • Published Jun 20 • 20
Towards flexible perception with visual memory

Paper • 2408.08172 • Published Aug 15 • 19

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data

Paper • 2404.15653 • Published Apr 24 • 26
MoDE: CLIP Data Experts via Clustering

Paper • 2404.16030 • Published Apr 24 • 12
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published May 20 • 45
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

Paper • 2405.12981 • Published May 21 • 28

VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction

Paper • 2402.17427 • Published Feb 27 • 9
Disentangled 3D Scene Generation with Layout Learning

Paper • 2402.16936 • Published Feb 26 • 10
RadSplat: Radiance Field-Informed Gaussian Splatting for Robust Real-Time Rendering with 900+ FPS

Paper • 2403.13806 • Published Mar 20 • 18
DreamCar: Leveraging Car-specific Prior for in-the-wild 3D Car Reconstruction

Paper • 2407.16988 • Published Jul 24 • 7

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs