Bamba Collection: a collection of models based on Bamba, a hybrid Mamba2 model architecture, trained on open data • 8 items • Updated 8 days ago
Article: Improving Hugging Face Training Efficiency Through Packing with Flash Attention • Aug 21