view article Article Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub By drbh and 6 others • Jun 12 • 120
Efficient Memory Management for Large Language Model Serving with PagedAttention Paper • 2309.06180 • Published Sep 12, 2023 • 25