ACECODER: Acing Coder RL via Automated Test-Case Synthesis Paper • 2502.01718 • Published 23 days ago • 28
Mantis-VL/qwen2-vl-video-eval_st_r2k_bad8k_49152_regression Text Classification • Updated Dec 22, 2024 • 11
Mantis-VL/qwen2-vl-video-eval_st_r2k_bad5k_49152_regression Text Classification • Updated Dec 22, 2024 • 9
Mantis-VL/qwen2-vl-video-eval_st_bad8k_49152_regression Text Classification • Updated Dec 22, 2024 • 12
Mantis-VL/qwen2-vl-video-eval_st_bad5k_49152_regression Text Classification • Updated Dec 19, 2024 • 12
Mantis-VL/qwen2-vl-video-eval_st_bad8k_55296_regression Text Classification • Updated Dec 19, 2024 • 19
Mantis-VL/qwen2-vl-video-eval_st_bad5k_55296_regression Text Classification • Updated Dec 19, 2024 • 12
Mantis-VL/qwen2-vl-video-eval_st_r2k_bad8k_61440_regression Text Classification • Updated Dec 18, 2024 • 5
Mantis-VL/qwen2-vl-video-eval_st_r2k_bad5k_61440_regression Text Classification • Updated Dec 18, 2024 • 13
Mantis-VL/qwen2-vl-video-eval_st_bad5k_61440_regression Text Classification • Updated Dec 18, 2024 • 12
Mantis-VL/qwen2-vl-video-eval_st_bad8k_61440_regression Text Classification • Updated Dec 18, 2024 • 13
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks Paper • 2410.10563 • Published Oct 14, 2024 • 39
MantisScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation Paper • 2406.15252 • Published Jun 21, 2024 • 16