videoscore2/vs2_qwen3vl_sft_27k_5e-5_2fps_960_720_8192 Image-to-Text • 770k • Updated Dec 8, 2025 • 2
videoscore2/vs2_internvl3_5_sft_27k_5e-5_2fps_960_720_8192 Any-to-Any • 695k • Updated Dec 8, 2025 • 2
TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models Paper • 2512.02014 • Published Dec 1, 2025 • 71
BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions Paper • 2510.10666 • Published Oct 12, 2025 • 27
TIGER-Lab/VideoScore2-RL-no-SFT Visual Question Answering • 8B • Updated Oct 13, 2025 • 8 • 1