xing0047's Collections
long-context-mllm

updated Oct 27, 2024

  • Visual Context Window Extension: A New Perspective for Long Video Understanding

    Paper • 2409.20018 • Published Sep 30, 2024 • 11

  • LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture

    Paper • 2409.02889 • Published Sep 4, 2024 • 55

  • Long Context Transfer from Language to Vision

    Paper • 2406.16852 • Published Jun 24, 2024 • 34

  • lmms-lab/LongVA-7B-DPO

    Text Generation • 8B • Updated Jun 26, 2024 • 451 • 8

  • lmms-lab/LongVA-7B

    Text Generation • 8B • Updated Jun 26, 2024 • 630 • 15

  • FreedomIntelligence/LongLLaVA-9B

    Image-Text-to-Text • 10B • Updated Oct 12, 2024 • 54 • 4

  • VideoLLaMB: Long-context Video Understanding with Recurrent Memory Bridges

    Paper • 2409.01071 • Published Sep 2, 2024 • 28

  • Why Does the Effective Context Length of LLMs Fall Short?

    Paper • 2410.18745 • Published Oct 24, 2024 • 18

  • LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding

    Paper • 2410.17434 • Published Oct 22, 2024 • 30