Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language Paper • 2406.20085 • Published Jun 28, 2024 • 12
An Open and Comprehensive Pipeline for Unified Object Grounding and Detection Paper • 2401.02361 • Published Jan 4, 2024
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning Paper • 2503.10291 • Published 21 days ago • 32
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM Paper • 2503.14478 • Published 16 days ago • 42
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM Paper • 2503.14478 • Published 16 days ago • 42
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning Paper • 2503.10291 • Published 21 days ago • 32
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models Paper • 2503.01774 • Published about 1 month ago • 41
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs Paper • 2503.01743 • Published about 1 month ago • 80