Bamba: Inference-Efficient Hybrid Mamba2 Model. By rganti and 28 others • Dec 18, 2024 • 49
Improving Hugging Face Training Efficiency Through Packing with Flash Attention. By lwtr and 5 others • Aug 21, 2024 • 33