Byte Latent Transformer: Patches Scale Better Than Tokens • Paper • arXiv:2412.09871 • Published Dec 2024
Adaptive Length Image Tokenization via Recurrent Allocation • Paper • arXiv:2411.02393 • Published Nov 4, 2024
A failed experiment: Infini-Attention, and why we should keep trying? • Article • Published Aug 14