Beyond Memorization: The Challenge of Random Memory Access in Language Models Paper • 2403.07805 • Published Mar 12, 2024
Translating Natural Language to Planning Goals with Large-Language Models Paper • 2302.05128 • Published Feb 10, 2023
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training Paper • 2411.13476 • Published Nov 20, 2024
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs Paper • 2502.12982 • Published Feb 18, 2025
SkyLadder: Better and Faster Pretraining via Context Window Scheduling Paper • 2503.15450 • Published Mar 19, 2025