view article Article Transformers Are Getting Old: Variants and Alternatives Exist! By ProCreations • 3 days ago • 23
view article Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • about 17 hours ago • 47
view article Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data By danaaubakirova and 8 others • Jun 3 • 183
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published 7 days ago • 173
ERNIE 4.5 Collection collection of ERNIE 4.5 models. "-Paddle" models use PaddlePaddle weights, while "-PT" models use Transformer-style PyTorch weights. • 23 items • Updated 5 days ago • 143
view article Article Welcome the NVIDIA Llama Nemotron Nano VLM to Hugging Face Hub By nvidia and 11 others • 11 days ago • 25
Gemma 3n Collection Google Gemma 3n models, all versions including Dynamic GGUF, 4-bit, 16-bit and formats! • 10 items • Updated 6 days ago • 10
view article Article 🤔👀🎬🖥️📖 Kimi-VL-A3B-Thinking-2506: A Quick Navigation By moonshotai and 1 other • 17 days ago • 57
Avey 1 Research Preview Collection 1.5B preview models trained on 100B tokens of FineWeb, and an instruct-tuned version on smoltalk. • 3 items • Updated 22 days ago • 6
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published 22 days ago • 250
MiniMax-M1 Collection MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. • 6 items • Updated 5 days ago • 106
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published Mar 18 • 132