Task-level Distributionally Robust Optimization for Large Language Model-based Dense Retrieval Paper • 2408.10613 • Published Aug 20, 2024
CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts Paper • 2410.16077 • Published Oct 21, 2024