cgus/Qwen2.5-1.5B-Instruct-exl2

Text Generation

4-bit precision


cgus committed on Dec 4, 2024

Commit 2fa8cfe · verified · 1 Parent(s): 2666862

Update README.md

Files changed (1)

README.md +19 -2

README.md CHANGED

@@ -4,11 +4,28 @@ license_link: https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/blob/main/LICENS
 language:
 - en
 pipeline_tag: text-generation
-base_model: Qwen/Qwen2.5-1.5B
 tags:
 - chat
-library_name: transformers
 ---
 # Qwen2.5-1.5B-Instruct

 language:
 - en
 pipeline_tag: text-generation
+base_model: Qwen/Qwen2.5-1.5B-Instruct
 tags:
 - chat
+library_name: Exllamav2
 ---
+# Qwen2.5-1.5B-Instruct-exl2
+Model: [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct)
+Creator: [Qwen](https://huggingface.co/Qwen)
+## Quants
+[4bpw h6 (main)](https://huggingface.co/cgus/Qwen2.5-1.5B-Instruct-exl2/tree/main)
+[4.5bpw h6](https://huggingface.co/cgus/Qwen2.5-1.5B-Instruct-exl2/tree/4.5bpw-h6)
+[5bpw h6](https://huggingface.co/cgus/Qwen2.5-1.5B-Instruct-exl2/tree/5bpw-h6)
+[5.5bpw h6](https://huggingface.co/cgus/Qwen2.5-1.5B-Instruct-exl2/tree/5.5bpw-h6)
+[6bpw h6](https://huggingface.co/cgus/Qwen2.5-1.5B-Instruct-exl2/tree/6bpw-h6)
+[6.5bpw h8](https://huggingface.co/cgus/Qwen2.5-1.5B-Instruct-exl2/tree/6.5bpw-h8)
+[8bpw h8](https://huggingface.co/cgus/Qwen2.5-1.5B-Instruct-exl2/tree/8bpw-h8)
+## Quantization notes
+Made with Exllamav2 0.2.4 with the default dataset.
+These quants might be useful as draft models for a 32B/70B model with TabbyAPI.
+Exllamav2 supports Nvidia RTX GPUs on Windows, and Nvidia RTX GPUs and ROCm-capable AMD cards on Linux.
 # Qwen2.5-1.5B-Instruct
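
The quant links added in this commit follow a regular branch naming pattern, so picking a quant can be sketched in a few lines of Python. This is a minimal sketch, not part of the README: the helper names (`branch_for`, `approx_weight_gib`, `download`) are hypothetical, the nominal 1.5e9 parameter count is an assumption taken from the model name, and using `huggingface_hub` to fetch a branch is one possible approach that the README itself does not prescribe. The size figure is only a rough estimate of the quantized weights and ignores the higher-precision head layer (h6/h8) and any runtime overhead.

```python
REPO_ID = "cgus/Qwen2.5-1.5B-Instruct-exl2"

# (bits per weight, head bits) pairs as listed under "Quants" in the README.
QUANTS = [(4.0, 6), (4.5, 6), (5.0, 6), (5.5, 6), (6.0, 6), (6.5, 8), (8.0, 8)]


def branch_for(bpw: float, head_bits: int) -> str:
    """Return the repo branch holding a quant; the 4bpw h6 quant lives on main."""
    if (bpw, head_bits) == (4.0, 6):
        return "main"
    # ":g" renders 4.5 as "4.5" and 5.0 as "5", matching the branch names in the links.
    return f"{bpw:g}bpw-h{head_bits}"


def approx_weight_gib(n_params: float, bpw: float) -> float:
    """Rough size of the quantized weights: parameters * bits per weight / 8 bytes."""
    return n_params * bpw / 8 / 2**30


def download(bpw: float, head_bits: int, local_dir: str) -> str:
    """Fetch one quant branch (assumes `pip install huggingface_hub` and network access)."""
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        revision=branch_for(bpw, head_bits),
        local_dir=local_dir,
    )


if __name__ == "__main__":
    for bpw, head_bits in QUANTS:
        size = approx_weight_gib(1.5e9, bpw)  # 1.5e9 is a nominal, assumed parameter count
        print(f"{branch_for(bpw, head_bits):>10}  ~{size:.2f} GiB of weights")
```

Calling `download(4.5, 6, "models/Qwen2.5-1.5B-Instruct-exl2-4.5bpw")` would then place that branch in a local folder that a loader such as TabbyAPI can be pointed at; the folder name here is only an example.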