PixMo Collection: a set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 9 items • Updated 28 days ago
Think Big, Generate Quick: LLM-to-SLM for Fast Autoregressive Decoding Paper • 2402.16844 • Published Feb 26
MSViT: Dynamic Mixed-Scale Tokenization for Vision Transformers Paper • 2307.02321 • Published Jul 5, 2023