Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models Paper • 2503.24377 • Published Mar 31 • 17
view article Article A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes By ybelkada and 1 other • Aug 17, 2022 • 94