LongVU - a Vision-CAIR Collection

Updated Oct 31, 2024

Vision-CAIR/LongVU_Qwen2_7B

Video-Text-to-Text • 8B • Updated Feb 28 • 319 • 72

Vision-CAIR/LongVU_Llama3_2_3B

Video-Text-to-Text • Updated Feb 28 • 51 • 7

Vision-CAIR/LongVU_Llama3_2_3B_img

Updated Feb 28 • 10 • 6

Vision-CAIR/LongVU_Qwen2_7B_img

Updated Feb 28 • 12 • 5

Vision-CAIR/LongVU_Llama3_2_1B

Video-Text-to-Text • Updated Feb 28 • 55 • 11

LongVU

🌖 • Running on Zero • 83
Generate responses to video or image inputs

LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding

Paper • 2410.17434 • Published Oct 22, 2024 • 30
