- FocusedAD: Character-centric Movie Audio Description
  Paper • 2504.12157 • Published • 9
- Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding
  Paper • 2504.10465 • Published • 28
- PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding
  Paper • 2504.13180 • Published • 17
- OS-Copilot/OS-Atlas-Base-7B
  Image-Text-to-Text • Updated • 2.54k • 38
Quang Huy
NothingLQH
AI & ML interests
None yet
Recent Activity
- updated a collection about 7 hours ago: ORC
- updated a collection about 10 hours ago: VLM
- updated a collection about 11 hours ago: Image
Organizations
None yet
Collections: 20
Models: 0 (none public yet)
Datasets: 0 (none public yet)