Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts Paper • 2206.02770 • Published Jun 6, 2022 • 3
Experts Weights Averaging: A New General Training Scheme for Vision Transformers Paper • 2308.06093 • Published Aug 11, 2023 • 2
Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks Paper • 2306.04073 • Published Jun 7, 2023 • 2
A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks Paper • 1811.00056 • Published Oct 31, 2018 • 2
Balanced Mixture of SuperNets for Learning the CNN Pooling Architecture Paper • 2306.11982 • Published Jun 21, 2023 • 2
Long-tailed Recognition by Routing Diverse Distribution-Aware Experts Paper • 2010.01809 • Published Oct 5, 2020 • 2
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models Paper • 2401.15947 • Published Jan 29, 2024 • 48