kaitchup
/

Qwen2.5-72B-Instruct-autoround-2bit-64g-2048-gptq

Model card Files Files and versions Community

bnjmnmarie commited on May 7

Commit

95420e8

·

verified ·

1 Parent(s): 1700227

Update README.md

Files changed (1) hide show

README.md +23 -3

README.md CHANGED Viewed

@@ -1,3 +1,23 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+base_model:
+- Qwen/Qwen2.5-72B-Instruct
+tags:
+- autoround
+---
+This is [Qwen/Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct) quantized with [AutoRound](https://github.com/intel/auto-round/tree/main/auto_round) in 2-bit (symmetric + gptq format) with a group size of 64 and calibration samples of 2048 tokens. The model has been created, tested, and evaluated by The Kaitchup.
+The model is compatible with vLLM and Transformers.
+More details in this article:
+[Accurate 2-bit Quantization: Run Massive LLMs on a Single Consumer GPU](https://kaitchup.substack.com/p/accurate-2-bit-quantization-run-massive)
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/64b93e6bd6c468ac7536607e/hOlFr-7E3oIZvNHHOuy-K.png)
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/64b93e6bd6c468ac7536607e/MfZPTnkvXp9elT_UE5DnE.png)
+- **Developed by:** [The Kaitchup](https://kaitchup.substack.com/)
+- **License:** Apache 2.0 license
+## How to Support My Work
+Subscribe to [The Kaitchup](https://kaitchup.substack.com/subscribe). This helps me a lot to continue quantizing and evaluating models for free.