SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration Paper • 2411.10958 • Published Nov 17, 2024