ERNIE 4.5 Collection A collection of ERNIE 4.5 models. "-Paddle" models use PaddlePaddle weights, while "-PT" models use Transformer-style PyTorch weights. • 25 items • Updated 9 days ago • 151
SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning Paper • 2506.19767 • Published 25 days ago • 13
Autoregressive Semantic Visual Reconstruction Helps VLMs Understand Better Paper • 2506.09040 • Published Jun 10 • 36
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published Jun 2 • 115
InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction Paper • 2505.10887 • Published May 16 • 10
Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI Paper • 2505.19443 • Published May 26 • 15
Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning Paper • 2505.15966 • Published May 21 • 53
OpenCodeReasoning-II Collection Reasoning data for supervised finetuning of LLMs to advance code generation and critique • 5 items • Updated 9 days ago • 8
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published May 14 • 95
Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency Paper • 2504.18589 • Published Apr 24 • 13
HyperCLOVA X SEED Collection HyperCLOVA X SEED is NAVER's lightweight open-source lineup with a strong focus on Korean language performance • 3 items • Updated Apr 24 • 27
Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models Paper • 2504.15271 • Published Apr 21 • 66
MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space Paper • 2504.13835 • Published Apr 18 • 38