spedrox-sac and lbourdois committed
Commit f94f978 · verified · Parent: 60c2bf3

Improve language tag (#1)


- Improve language tag (fc5f45ac979ef77685bfe1f1325b4815b8a19a76)


Co-authored-by: Loïck BOURDOIS <[email protected]>

Files changed (1)
  README.md (+61 -49)
README.md CHANGED
@@ -1,50 +1,62 @@
- ---
- license: mit
- datasets:
- - fka/awesome-chatgpt-prompts
- language:
- - en
- base_model:
- - Qwen/Qwen2.5-1.5B-Instruct
- pipeline_tag: text-generation
- ---
- # Quantized Qwen2.5-1.5B-Instruct
-
- This repository contains 8-bit and 4-bit quantized versions of the Qwen2.5-1.5B-Instruct model using GPTQ. Quantization significantly reduces the model's size and memory footprint, enabling faster inference on resource-constrained devices while maintaining reasonable performance.
-
-
- ## Model Description
-
- The Qwen2.5-1.5B-Instruct is a powerful language model developed by Qwen for instructional tasks. These quantized versions offer a more efficient way to deploy and utilize this model.
-
-
- ## Quantization Details
-
- * **Quantization Method:** GPTQ (Generative Pretrained Transformer Quantization)
- * **Quantization Bits:** 8-bit and 4-bit versions available.
- * **Dataset:** The model was quantized using a subset of the "fka/awesome-chatgpt-prompts" dataset.
-
-
- ## Usage
-
- To use the quantized models, follow these steps:
-
- **Install Dependencies:**
- ```bash
- pip install transformers accelerate bitsandbytes auto-gptq optimum
- ```
- ## Performance
-
- The quantized models offer a significant reduction in size and memory usage compared to the original model. While there might be a slight decrease in performance, the trade-off is often beneficial for deployment on devices with limited resources.
-
-
- ## Disclaimer
-
- These quantized models are provided for research and experimentation purposes. We do not guarantee their performance or suitability for specific applications.
-
-
- ## Acknowledgements
-
- * **Qwen:** For developing the original Qwen2.5-1.5B-Instruct model.
- * **Hugging Face:** For providing the platform and tools for model sharing and quantization.
+ ---
+ license: mit
+ datasets:
+ - fka/awesome-chatgpt-prompts
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ base_model:
+ - Qwen/Qwen2.5-1.5B-Instruct
+ pipeline_tag: text-generation
+ ---
+ # Quantized Qwen2.5-1.5B-Instruct
+
+ This repository contains 8-bit and 4-bit quantized versions of the Qwen2.5-1.5B-Instruct model using GPTQ. Quantization significantly reduces the model's size and memory footprint, enabling faster inference on resource-constrained devices while maintaining reasonable performance.
+
+
+ ## Model Description
+
+ The Qwen2.5-1.5B-Instruct is a powerful language model developed by Qwen for instructional tasks. These quantized versions offer a more efficient way to deploy and utilize this model.
+
+
+ ## Quantization Details
+
+ * **Quantization Method:** GPTQ (Generative Pretrained Transformer Quantization)
+ * **Quantization Bits:** 8-bit and 4-bit versions available.
+ * **Dataset:** The model was quantized using a subset of the "fka/awesome-chatgpt-prompts" dataset.
+
+
+ ## Usage
+
+ To use the quantized models, follow these steps:
+
+ **Install Dependencies:**
+ ```bash
+ pip install transformers accelerate bitsandbytes auto-gptq optimum
+ ```
+ ## Performance
+
+ The quantized models offer a significant reduction in size and memory usage compared to the original model. While there might be a slight decrease in performance, the trade-off is often beneficial for deployment on devices with limited resources.
+
+
+ ## Disclaimer
+
+ These quantized models are provided for research and experimentation purposes. We do not guarantee their performance or suitability for specific applications.
+
+
+ ## Acknowledgements
+
+ * **Qwen:** For developing the original Qwen2.5-1.5B-Instruct model.
+ * **Hugging Face:** For providing the platform and tools for model sharing and quantization.
  * **GPTQ Authors:** For developing the GPTQ quantization method.
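
The README's Quantization Details section names GPTQ and a calibration subset of fka/awesome-chatgpt-prompts but does not include the quantization script. Below is a minimal sketch of how such a run could look using transformers' `GPTQConfig`; it is not the author's exact recipe, and the calibration sample size (128 prompts) and output directory are assumptions.

```python
# Minimal sketch of GPTQ quantization as described in the README above.
# Assumptions (not confirmed by the source): calibration sample size of
# 128 prompts and the output directory name.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

base_id = "Qwen/Qwen2.5-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Calibration text: a small subset of the prompts dataset named in the README.
prompts = load_dataset("fka/awesome-chatgpt-prompts", split="train")["prompt"][:128]

# bits=4 for the 4-bit variant; the 8-bit variant would use bits=8.
quant_config = GPTQConfig(bits=4, dataset=prompts, tokenizer=tokenizer)

# Passing a GPTQConfig makes from_pretrained quantize on load (GPU required).
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=quant_config,
    device_map="auto",
)
model.save_pretrained("qwen2.5-1.5b-instruct-gptq-4bit")  # assumed output dir
tokenizer.save_pretrained("qwen2.5-1.5b-instruct-gptq-4bit")
```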
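The Usage section stops after installing dependencies. A sketch of the loading and inference step follows; since the README does not name the quantized checkpoints' repository paths, the model ID below is a hypothetical placeholder.

```python
# Minimal sketch of loading one of the quantized checkpoints.
# The repo ID below is a placeholder, not a confirmed path.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "spedrox-sac/Qwen2.5-1.5B-Instruct-GPTQ-4bit"  # hypothetical ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build a chat-formatted prompt and generate a short completion.
messages = [{"role": "user", "content": "Summarize GPTQ in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```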