FlashFormer: Whole-Model Kernels for Efficient Low-Batch Inference • Paper 2505.22758 • Published May 2025
PaTH Attention: Position Encoding via Accumulating Householder Transformations • Paper 2505.16381 • Published May 2025