A Thorough Examination of Decoding Methods in the Era of LLMs Paper • 2402.06925 • Published Feb 10
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published 13 days ago