New Transformers or alternatives
Differential Transformer Paper • 2410.05258 • Published Oct 7, 2024 • 180
Transformer-based Models for Computer Vision
MIO: A Foundation Model on Multimodal Tokens Paper • 2409.17692 • Published Sep 26, 2024 • 54
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale Paper • 2010.11929 • Published Oct 22, 2020 • 11
Going deeper with Image Transformers Paper • 2103.17239 • Published Mar 31, 2021
Training data-efficient image transformers & distillation through attention Paper • 2012.12877 • Published Dec 23, 2020 • 2