- Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents
  Paper • 2410.05243 • Published • 18
- GPT-4V(ision) is a Generalist Web Agent, if Grounded
  Paper • 2401.01614 • Published • 22
- osunlp/UGround
  Image-Text-to-Text • Updated • 1.5k • 21
- osunlp/UGround-V1-2B
  Image-Text-to-Text • Updated • 1.04k • 7
Boyu Gou
BoyuNLP
AI & ML interests
AI Agents, Foundation Models, GUI Agents
Recent Activity
- new activity, about 7 hours ago, in osunlp/UGround-V1-Data: Null byte error
- new activity, about 7 hours ago, in Qwen/Qwen2.5-VL-3B-Instruct: A minor typo of the score on AndroidWorld
- liked a dataset, 3 days ago: osunlp/UGround-V1-Data
Organizations
Collections
1
models
None public yet