- In-Context Pretraining: Language Modeling Beyond Document Boundaries
  Paper • 2310.10638 • Published • 28
- Magicoder: Source Code Is All You Need
  Paper • 2312.02120 • Published • 79
- Parameter Efficient Tuning Allows Scalable Personalization of LLMs for Text Entry: A Case Study on Abbreviation Expansion
  Paper • 2312.14327 • Published • 6
- WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation
  Paper • 2312.14187 • Published • 49
Collections including paper arxiv:2401.02385
- Approximating Two-Layer Feedforward Networks for Efficient Transformers
  Paper • 2310.10837 • Published • 10
- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 96
- QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models
  Paper • 2310.16795 • Published • 26
- LLM-FP4: 4-Bit Floating-Point Quantized Transformers
  Paper • 2310.16836 • Published • 13
- MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
  Paper • 2309.04662 • Published • 22
- Neurons in Large Language Models: Dead, N-gram, Positional
  Paper • 2309.04827 • Published • 16
- Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
  Paper • 2309.05516 • Published • 9
- DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
  Paper • 2309.03907 • Published • 8