The Ultra-Scale Playbook 🌌 The ultimate guide to training LLMs on large GPU clusters
MM-RLHF: The Next Step Forward in Multimodal LLM Alignment • Paper 2502.10391 • Published Feb 14, 2025
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention • Paper 2502.11089 • Published Feb 16, 2025
Fast Inference from Transformers via Speculative Decoding • Paper 2211.17192 • Published Nov 30, 2022