- Code release: https://github.com/microsoft/torchscale
- Sep 2022: accepted by NeurIPS 2022
- April 2022: release preprint of **X-MoE** - [On the Representation Collapse of Sparse Mixture of Experts](https://arxiv.org/abs/2204.09179)

```
@inproceedings{xmoe,
  title={On the Representation Collapse of Sparse Mixture of Experts},
  author={Zewen Chi and Li Dong and Shaohan Huang and Damai Dai and Shuming Ma and Barun Patra and Saksham Singhal and Payal Bajaj and Xia Song and Xian-Ling Mao and Heyan Huang and Furu Wei},
  booktitle={Advances in Neural Information Processing Systems},
  year={2022},
  url={https://openreview.net/forum?id=mWaYC6CZf5}
}
```