
Ye Liu

yeliudev

https://yeliu.dev/


AI & ML interests

Vision & Language

Recent Activity

upvoted a paper 13 days ago

Code2World: A GUI World Model via Renderable Code Generation

updated a model 29 days ago

yeliudev/VideoMind-2B-FT-QVHighlights

updated a dataset 29 days ago

yeliudev/VideoMind-Dataset



yeliudev's collections 4

VideoMind

[ICLR 2026] VideoMind: A Chain-of-LoRA Agent for Temporal-Grounded Video Reasoning

Running on Zero

37

VideoMind 2B

💡

37

A Chain-of-LoRA Agent for Temporal-Grounded Video Reasoning
yeliudev/VideoMind-2B

Video-Text-to-Text • Updated 29 days ago • 22 • 2
yeliudev/VideoMind-7B

Video-Text-to-Text • Updated 29 days ago • 13 • 4
yeliudev/VideoMind-Dataset

Preview • Updated 29 days ago • 3.06k • 11

E.T. Bench

[NeurIPS 2024] E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding

PolyU-ChenLab/ETBench

Viewer • Updated Oct 29, 2024 • 5 • 104 • 4
PolyU-ChenLab/ET-Instruct-164K

Viewer • Updated Sep 27, 2024 • 115k • 283 • 4
PolyU-ChenLab/ETChat-Phi3-Mini-Stage-1

Video-Text-to-Text • 5B • Updated Oct 29, 2024 • 1 • 2
PolyU-ChenLab/ETChat-Phi3-Mini-Stage-2

5B • Updated Sep 27, 2024 • 753

UniPixel

[NeurIPS 2025] UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning

Running on Zero

6

UniPixel

🔮

6

An MLLM for Unified Object Referring and Segmentation
PolyU-ChenLab/UniPixel-3B

Video-Text-to-Text • 4B • Updated Oct 4, 2025 • 239 • 3
PolyU-ChenLab/UniPixel-7B

Video-Text-to-Text • 8B • Updated Oct 22, 2025 • 90 • 1
PolyU-ChenLab/UniPixel-SFT-1M

Preview • Updated Oct 4, 2025 • 1.06k • 2

R2-Tuning

[ECCV 2024] R2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding

Running

6

R2-Tuning

🌀

6

[ECCV 2024] Localizing moments in videos via text queries
yeliudev/R2-Tuning

Updated Apr 17, 2024 • 2
R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding

Paper • 2404.00801 • Published Mar 31, 2024 • 1
