Article Cohere on Hugging Face Inference Providers 🔥 By burtenshaw and 6 others • Apr 16 • 127
Article Making LLMs Smaller Without Breaking Them: A GLU-Aware Pruning Approach By oopere • Nov 24, 2024 • 7
Dolphin 3.0 Collection Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models, designed to be the ultimate general-purpose local model. • 9 items • Updated Feb 7 • 174
Unsloth 4-bit Dynamic Quants Collection Unsloth's Dynamic 4-bit Quants selectively skip quantizing certain parameters, greatly improving accuracy while using only <10% more VRAM than BnB 4-bit • 28 items • Updated 9 days ago • 83