Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning Paper • 2505.03318 • Published 1 day ago • 63
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL Paper • 2505.02391 • Published 3 days ago • 21
Step1X-Edit: A Practical Framework for General Image Editing Paper • 2504.17761 • Published 13 days ago • 86
A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce Paper • 2504.11343 • Published 22 days ago • 16
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published 23 days ago • 255
Pangu Ultra: Pushing the Limits of Dense Large Language Models on Ascend NPUs Paper • 2504.07866 • Published 27 days ago • 10
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning Paper • 2503.10291 • Published Mar 13 • 36
An Empirical Study of GPT-4o Image Generation Capabilities Paper • 2504.05979 • Published 29 days ago • 62
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought Paper • 2504.05599 • Published 30 days ago • 81
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published 29 days ago • 159
One-Minute Video Generation with Test-Time Training Paper • 2504.05298 • Published about 1 month ago • 102
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM Mar 12 • 406