Cost-Optimal Grouped-Query Attention for Long-Context LLMs Paper • 2503.09579 • Published Mar 12
Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling Paper • 2410.07145 • Published Oct 9, 2024