Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities Paper • 2505.02567 • Published 7 days ago • 64
A Survey on Inference Engines for Large Language Models: Perspectives on Optimization and Efficiency Paper • 2505.01658 • Published 9 days ago • 30
Low-Precision Training of Large Language Models: Methods, Challenges, and Opportunities Paper • 2505.01043 • Published 10 days ago • 9
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction Paper • 2404.02905 • Published Apr 3, 2024 • 71
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 290
Kimi k1.5: Scaling Reinforcement Learning with LLMs Paper • 2501.12599 • Published Jan 22 • 115
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 390
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 229
Sleep-time Compute: Beyond Inference Scaling at Test-time Paper • 2504.13171 • Published 24 days ago • 15
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 114
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7 • 181
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published Apr 8 • 160
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31 • 276
Wan: Open and Advanced Large-Scale Video Generative Models Paper • 2503.20314 • Published Mar 26 • 51
A Comprehensive Survey on Long Context Language Modeling Paper • 2503.17407 • Published Mar 20 • 49