One Vision-Language-Action Model for GUI Agent
Qinghong (Kevin) Lin
KevinQHLin
AI & ML interests
Vision-Language Model, Video Understanding, Human-AI Interaction
Recent Activity
liked a Space 15 days ago
chenjoya/LiveCC
upvoted a paper 22 days ago
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale