VLM - a NothingLQH Collection

updated 1 day ago

FocusedAD: Character-centric Movie Audio Description
Paper • 2504.12157 • Published about 1 month ago • 9

Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding
Paper • 2504.10465 • Published Apr 14 • 28

PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding
Paper • 2504.13180 • Published 29 days ago • 17

OS-Copilot/OS-Atlas-Base-7B
Image-Text-to-Text • Updated Nov 19, 2024 • 2.54k • 38

google/siglip-so400m-patch14-384
Zero-Shot Image Classification • Updated Sep 26, 2024 • 7.25M • 536