Unsloth Dynamic 2.0 Quants Collection New 2.0 version of our Dynamic GGUF + Quants. Dynamic 2.0 achieves superior accuracy & SOTA quantization performance. • 43 items • Updated 4 days ago • 167
Llama 3.1 Collection Collection Meta's Llama 3.1 models including 8B, 70B, 405B. Includes 4-bit bnb and original versions. • 13 items • Updated 6 days ago • 6
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The model preserves similar quality as half precision while using 3x less memory • 15 items • Updated 26 days ago • 207
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published Jan 4 • 100
Deliberation in Latent Space via Differentiable Cache Augmentation Paper • 2412.17747 • Published Dec 23, 2024 • 33
Smaller Language Models Are Better Instruction Evolvers Paper • 2412.11231 • Published Dec 15, 2024 • 29
Merging in a Bottle: Differentiable Adaptive Merging (DAM) and the Path from Averaging to Automation Paper • 2410.08371 • Published Oct 10, 2024 • 2
GGUF Llama-3.2-Instruct-OQ8_0-F32.EF32.IQ4_K-Q8_0 IQuants Collection Custom GGUF quants of Meta’s Llama-3.2-Instruct's finetunes, where the Output Tensors are quantized to Q8_0 or F32 and the Embeddings are kept @F32 • 3 items • Updated Dec 13, 2024 • 2
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 627
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated 14 days ago • 331
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing Paper • 2406.08464 • Published Jun 12, 2024 • 70
Recent highlights Collection Some recent models worth checking out • 18 items • Updated Nov 1, 2024 • 53
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated 14 days ago • 630
BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation Paper • 2402.16880 • Published Feb 18, 2024 • 2
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 682
LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated Jun 27, 2024 • 151
Article Expanding Model Context and Creating Chat Models with a Single Click By maywell • Apr 28, 2024 • 38
Honorable mentions Collection Some models I've made and liked that aren't part of a series. • 10 items • Updated Feb 4, 2024 • 6