Large Action Models: From Inception to Implementation Paper • 2412.10047 • Published 13 days ago • 28
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published 12 days ago • 131
Smaller Language Models Are Better Instruction Evolvers Paper • 2412.11231 • Published 11 days ago • 25
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published 13 days ago • 75
FastVLM: Efficient Vision Encoding for Vision Language Models Paper • 2412.13303 • Published 8 days ago • 13
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces Paper • 2412.14171 • Published 7 days ago • 22
Descriptive Caption Enhancement with Visual Specialists for Multimodal Perception Paper • 2412.14233 • Published 7 days ago • 6
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling Paper • 2412.15084 • Published 6 days ago • 12
Falcon3 Collection The Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated 7 days ago • 71
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated 28 days ago • 257
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data Paper • 2410.18558 • Published Oct 24 • 18
Granite 3.0 Language Models Collection A series of language models trained by IBM and licensed under Apache 2.0. We release both the base pretrained and instruct models. • 8 items • Updated 8 days ago • 96