MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design • Paper • 2412.14590 • Published 7 days ago • 11
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference • Paper • 2412.13663 • Published 8 days ago • 103
Apollo: An Exploration of Video Understanding in Large Multimodal Models • Paper • 2412.10360 • Published 12 days ago • 131
POINTS1.5: Building a Vision-Language Model towards Real World Applications • Paper • 2412.08443 • Published 15 days ago • 38
ProcessBench: Identifying Process Errors in Mathematical Reasoning • Paper • 2412.06559 • Published 17 days ago • 68
CompCap: Improving Multimodal Large Language Models with Composite Captions • Paper • 2412.05243 • Published 19 days ago • 18
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion • Paper • 2412.04424 • Published 20 days ago • 55
PaliGemma 2: A Family of Versatile VLMs for Transfer • Paper • 2412.03555 • Published 21 days ago • 118
PaliGemma 2 Release • Collection • Vision-Language Models available in multiple 3B, 10B and 28B variants. • 23 items • Updated 12 days ago • 119
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs • Paper • 2411.19146 • Published 28 days ago • 13