Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context Paper • 1901.02860 • Published Jan 9, 2019 • 4
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published Apr 10, 2024 • 111
Article Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face By dvgodoy • Feb 11 • 55
Article Small Language Models (SLM): A Comprehensive Overview By jjokah • Feb 22 • 52
The Pile: An 800GB Dataset of Diverse Text for Language Modeling Paper • 2101.00027 • Published Dec 31, 2020 • 7
Direct Preference Optimization: Your Language Model is Secretly a Reward Model Paper • 2305.18290 • Published May 29, 2023 • 63
Article Making LLMs Smaller Without Breaking Them: A GLU-Aware Pruning Approach By oopere • Nov 24, 2024 • 10