Fanheng Kong

friedrichor

https://friedrichor.github.io/

AI & ML interests

Multimodal LLM, LLM, Vibe Coding

Recent Activity

authored a paper 20 days ago

Evaluating Multimodal Large Language Models on Video Captioning via Monte Carlo Tree Search

authored a paper 20 days ago

Accelerating Diffusion LLM Inference via Local Determinism Propagation

authored a paper 20 days ago

WebTestBench: Evaluating Computer-Use Agents towards End-to-End Automated Web Testing

Collections 1

Papers 6

arxiv:2603.25226

arxiv:2510.07081

arxiv:2506.11155

arxiv:2505.20124

Models 5

friedrichor/Unite-Instruct-Qwen2-VL-7B

Feature Extraction • 8B • Updated Jun 10, 2025 • 72

friedrichor/Unite-Instruct-Qwen2-VL-2B

Feature Extraction • 2B • Updated Jun 10, 2025 • 16 • 1

friedrichor/Unite-Base-Qwen2-VL-7B

Feature Extraction • 8B • Updated Jun 10, 2025 • 146 • 1

friedrichor/Unite-Base-Qwen2-VL-2B

Feature Extraction • 2B • Updated Jun 10, 2025 • 243

friedrichor/stable-diffusion-2-1-realistic

Text-to-Image • Updated Mar 19, 2025 • 10 • 5

Datasets 10

friedrichor/WebTestBench

Viewer • Updated 21 days ago • 100 • 182

friedrichor/DiDeMo

Viewer • Updated Jul 23, 2025 • 9.4k • 861 • 10

friedrichor/ActivityNet_Captions

Viewer • Updated Jul 23, 2025 • 19.8k • 1.31k • 10

friedrichor/Unite-Instruct-Retrieval-Train

Viewer • Updated Jun 19, 2025 • 1.27M • 96 • 1

friedrichor/Unite-Base-Retrieval-Train

Viewer • Updated Jun 19, 2025 • 6.38M • 320

friedrichor/TUNA-Bench

Viewer • Updated Jun 11, 2025 • 3.43k • 139

friedrichor/MSVD

Viewer • Updated May 20, 2025 • 1.97k • 560 • 6

friedrichor/MSR-VTT

Viewer • Updated May 20, 2025 • 17k • 1.62k • 16

friedrichor/PhotoChat_image

Viewer • Updated Jul 3, 2023 • 8.54k • 57 • 2

friedrichor/PhotoChat_120_square_HQ

Viewer • Updated May 31, 2023 • 120 • 20