Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs Paper • 2403.20041 • Published Mar 29