Hymba: A Hybrid-head Architecture for Small Language Models Paper • 2411.13676 • Published Nov 20 • 39
Star Attention: Efficient LLM Inference over Long Sequences Paper • 2411.17116 • Published about 1 month ago • 47
RULER: What's the Real Context Size of Your Long-Context Language Models? Paper • 2404.06654 • Published Apr 9 • 34
Do Long-Range Language Models Actually Use Long-Range Context? Paper • 2109.09115 • Published Sep 19, 2021
Efficiently Upgrading Multilingual Machine Translation Models to Support More Languages Paper • 2302.03528 • Published Feb 7, 2023
PEARL: Prompting Large Language Models to Plan and Execute Actions Over Long Documents Paper • 2305.14564 • Published May 23, 2023 • 1