Inference Optimization of Foundation Models on AI Accelerators • Paper • 2407.09111 • Published Jul 12, 2024
A Survey on Inference Optimization Techniques for Mixture of Experts Models • Paper • 2412.14219 • Published Dec 18, 2024