Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models? Paper • 2502.11895 • Published Feb 17, 2025 • 2
An Extra RMSNorm is All You Need for Fine Tuning to 1.58 Bits Paper • 2505.08823 • Published May 12, 2025 • 2
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 622
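The three papers above revolve around the same 1.58-bit recipe: weights constrained to {-1, 0, +1} via an absmean scale and trained quantization-aware with a straight-through estimator. A minimal PyTorch sketch of that idea (function and class names are mine, not the papers' code):

```python
import torch

def absmean_ternary_quantize(w: torch.Tensor, eps: float = 1e-5):
    """Quantize a weight tensor to {-1, 0, +1} with a per-tensor absmean scale,
    roughly as described for BitNet b1.58 (layout here is illustrative)."""
    scale = w.abs().mean().clamp(min=eps)      # gamma = mean(|W|)
    w_q = (w / scale).round().clamp(-1, 1)     # RoundClip(W / gamma, -1, 1)
    return w_q, scale

class TernaryLinear(torch.nn.Module):
    """Linear layer whose weights are ternarized on the fly; the straight-through
    estimator keeps the layer trainable (i.e. quantization-aware training)."""
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.empty(out_features, in_features))
        torch.nn.init.normal_(self.weight, std=0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w_q, scale = absmean_ternary_quantize(self.weight)
        # Forward uses the ternary weights; backward passes gradients straight
        # through to the latent full-precision weights.
        w_ste = self.weight + (w_q * scale - self.weight).detach()
        return torch.nn.functional.linear(x, w_ste)
```

The papers above additionally quantize activations (to 8 bits in BitNet b1.58) and place an RMSNorm in front of the quantized linear layers; this sketch keeps activations in full precision.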
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 106
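As I understand the Byte Latent Transformer setup, raw bytes are grouped into dynamically sized patches wherever a small byte-level LM is uncertain about the next byte, so model capacity is spent where the data is hard to predict. A rough sketch of that entropy-threshold segmentation (the threshold value and function name are illustrative):

```python
import math
from typing import List

def entropy_patch_boundaries(next_byte_probs: List[List[float]],
                             threshold: float = 2.0) -> List[int]:
    """Return patch start indices: begin a new patch wherever the predicted
    next-byte distribution has high entropy (a sketch, not the paper's code).
    next_byte_probs[i] is a probability distribution over the byte at position i."""
    boundaries = [0]
    for i, probs in enumerate(next_byte_probs):
        entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
        if i > 0 and entropy > threshold:
            boundaries.append(i)   # high uncertainty -> start a new patch here
    return boundaries
```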
llama.vim Collection Recommended models for the llama.vim and llama.vscode plugins • 9 items • Updated May 14 • 43
story writing favourites Collection Models I personally liked for generating stories in the past. Not a recommendation; most of these are outdated. • 24 items • Updated Jun 11 • 59
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models Paper • 2407.12327 • Published Jul 17, 2024 • 80
LongVA Collection Long Context Transfer From Text To Vision: https://lmms-lab.github.io/posts/longva/ • 5 items • Updated Oct 4, 2024 • 13
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published Apr 30, 2024 • 78
Extending Context Window of Large Language Models via Positional Interpolation Paper • 2306.15595 • Published Jun 27, 2023 • 53
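Positional interpolation extends a RoPE model's context by rescaling position indices by train_length / target_length, so every position in the longer window maps back into the range seen during pre-training, followed by a short fine-tune on long sequences. A small sketch of the rescaled rotary angles, assuming PyTorch (the defaults are illustrative):

```python
import torch

def rope_angles(positions: torch.Tensor, dim: int, base: float = 10000.0,
                train_len: int = 2048, target_len: int = 2048) -> torch.Tensor:
    """RoPE rotation angles with positional interpolation: 1-D integer positions
    are rescaled by train_len / target_len before the usual frequency outer product."""
    scale = train_len / target_len                       # < 1 when extending the context
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    scaled_pos = positions.float() * scale               # m -> m * L / L'
    return torch.outer(scaled_pos, inv_freq)             # (seq, dim/2) angles
```

For example, train_len=2048 with target_len=8192 compresses positions by 4x before the standard rotary embedding is applied.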