Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models • Paper • 2506.19697 • Published 6 days ago
What Matters in Transformers? Not All Attention is Needed • Paper • 2406.15786 • Published Jun 22, 2024