Holo1 Collection Vision-Language Action Model for use in Surfer-H web navigation agent • 5 items • Updated 1 day ago • 35
AGUVIS: Unified Pure Vision GUI Agents Collection https://aguvis-project.github.io • 3 items • Updated Dec 20, 2024 • 6
Multimodal Models Collection Multimodal models with leading performance. • 17 items • Updated Jan 17 • 35
Article Vision Language Models (Better, Faster, Stronger) By merve and 4 others • 25 days ago • 414
video-effects datasets Collection Smol datasets to emulate cool video effects like "squish", "dissolve", etc. Inspired by Pika effects. • 4 items • Updated Jan 28 • 4
AIMv2 Collection A collection of AIMv2 vision encoders that supports a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 77
Coursera - Hands-on Data Centric Visual AI Collection This collection has the in-class lecture and homework datasets for the Coursera MOOC, Hands-on Data Centric Visual AI. • 4 items • Updated Jul 31, 2024 • 2
🍃 MINT-1T Collection Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24, 2024 • 60
Article The CVPR Survival Guide: Discovering Research That's Interesting to YOU! By harpreetsahota • Jun 14, 2024 • 9
Article FiftyOne Computer Vision Datasets Come to the Hugging Face Hub By jamarks • Jun 3, 2024 • 12
Article SeeMoE: Implementing a MoE Vision Language Model from Scratch By AviSoori1x • Jun 23, 2024 • 34
Article seemore: Implement a Vision Language Model from Scratch By AviSoori1x • Jun 23, 2024 • 88
Article Streamline Computer Vision Workflows with Hugging Face Transformers and FiftyOne By jamarks • Feb 27, 2024 • 8
DeciDiffusion Models Collection The DeciDiffusion family of models are text-to-image diffusion models that are faster than Stable Diffusion v1.6 yet generate images of comparable quality. • 4 items • Updated Jan 17, 2024 • 1
DeciLM Models Collection DeciLMs are small, but mighty, language models. Members of the DeciLM family of models include 6 and 7 billion parameter models. • 7 items • Updated Jan 17, 2024 • 3
DeciCoder Models Collection DeciCoder models are fast, efficient, and accurate code-generation models. This family of models includes DeciCoder-1B and DeciCoder-6B. • 4 items • Updated Jan 17, 2024 • 1
Instruction-Following Evaluation for Large Language Models Paper • 2311.07911 • Published Nov 14, 2023 • 20