SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 7 days ago • 98
Reasoning Datasets Collection Distilled synthetic Reasoning datasets • 7 items • Updated 3 days ago • 42
Eagle 2 Collection Eagle 2 is a family of frontier vision-language models with vision-centric design. The model supports 4K HD input, long-context video, and grounding. • 9 items • Updated 13 days ago • 29
Train 400x faster Static Embedding Models with Sentence Transformers Article • 21 days ago • 132
Deepthink and Reasoning Collection Best for Deepthink and Reasoning • 14 items • Updated 11 days ago • 16
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 132
Bamba Collection Collection of Bamba models, based on a hybrid Mamba2 architecture and trained on open data • 8 items • Updated Dec 18, 2024 • 18
Falcon3 Collection Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated 28 days ago • 80
PaliGemma 2 Release Collection Vision-Language Models available in multiple 3B, 10B and 28B variants. • 23 items • Updated Dec 13, 2024 • 134
🔱 Sailor2 Language Models Collection Sailing in South-East Asia with Inclusive Multilingual LLMs • 9 items • Updated Dec 3, 2024 • 22