Re-upload of GPTQ model due to issue with base model
README.md
CHANGED
@@ -21,7 +21,7 @@ license: other
 
These files are GPTQ 4bit model files for [Camel AI's CAMEL 13B Role Playing Data](https://huggingface.co/camel-ai/CAMEL-13B-Role-Playing-Data).

-It is the result of quantising to 4bit using [
+It is the result of quantising to 4bit using [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ).

## Repositories available

@@ -36,10 +36,13 @@ Please make sure you're using the latest version of text-generation-webui
1. Click the **Model tab**.
2. Under **Download custom model or LoRA**, enter `TheBloke/CAMEL-13B-Role-Playing-Data-GPTQ`.
3. Click **Download**.
-4. The model will start downloading
-5.
+4. The model will start downloading. Once it's finished it will say "Done"
+5. In the top left, click the refresh icon next to **Model**.
+6. In the **Model** dropdown, choose the model you just downloaded: `CAMEL-13B-Role-Playing-Data-GPTQ`
+7. The model will automatically load, and is now ready for use!
+8. If you want any custom settings, set them and then click **Save settings for this model** followed by **Reload the Model** in the top right.
* Note that you do not need to set GPTQ parameters any more. These are set automatically from the file `quantize_config.json`.
-
+9. Once you're ready, click the **Text Generation tab** and enter a prompt to get started!

## How to use this GPTQ model from Python code

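As the note above says, the GPTQ parameters are read from `quantize_config.json` in the repo, so nothing needs to be set by hand. If you want to check them yourself, here is a minimal sketch, assuming `huggingface_hub` is installed; the filename is the one named in the README, and the expected values come from the Provided files section further down:

```python
# Hedged sketch: download and inspect the repo's quantize_config.json,
# the file AutoGPTQ / text-generation-webui read the GPTQ parameters from.
import json

from huggingface_hub import hf_hub_download

config_path = hf_hub_download(
    repo_id="TheBloke/CAMEL-13B-Role-Playing-Data-GPTQ",
    filename="quantize_config.json",
)

with open(config_path) as f:
    quantize_config = json.load(f)

# Expected per the "Provided files" section: bits=4, group_size=128, desc_act=False
print(quantize_config)
```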
@@ -55,7 +58,7 @@ from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
import argparse

model_name_or_path = "TheBloke/CAMEL-13B-Role-Playing-Data-GPTQ"
-model_basename = "camel-13b-roleplay-GPTQ-4bit
+model_basename = "camel-13b-roleplay-GPTQ-4bit-128g.no-act.order"

use_triton = False

@@ -64,11 +67,15 @@ tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
        model_basename=model_basename,
        use_safetensors=True,
-        trust_remote_code=
+        trust_remote_code=False,
        device="cuda:0",
        use_triton=use_triton,
        quantize_config=None)

+prompt = "Tell me about AI"
+prompt_template=f'''### Human: {prompt}
+### Assistant:'''
+
print("\n\n*** Generate:")

input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
@@ -80,10 +87,6 @@ print(tokenizer.decode(output[0]))
# Prevent printing spurious transformers error when using pipeline with AutoGPTQ
logging.set_verbosity(logging.CRITICAL)

-prompt = "Tell me about AI"
-prompt_template=f'''### Human: {prompt}
-### Assistant:'''
-
print("*** Pipeline:")
pipe = pipeline(
    "text-generation",
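The Python example only appears above as diff hunks. Assembled into a single runnable script it looks roughly like this; the generation settings (`max_new_tokens` and the pipeline arguments) are illustrative assumptions, everything else follows the lines shown in the hunks:

```python
# Consolidated sketch of the README's AutoGPTQ example.
from transformers import AutoTokenizer, pipeline, logging
from auto_gptq import AutoGPTQForCausalLM

model_name_or_path = "TheBloke/CAMEL-13B-Role-Playing-Data-GPTQ"
model_basename = "camel-13b-roleplay-GPTQ-4bit-128g.no-act.order"

use_triton = False

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
        model_basename=model_basename,
        use_safetensors=True,
        trust_remote_code=False,
        device="cuda:0",
        use_triton=use_triton,
        quantize_config=None)

prompt = "Tell me about AI"
prompt_template = f'''### Human: {prompt}
### Assistant:'''

print("\n\n*** Generate:")

input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
output = model.generate(inputs=input_ids, max_new_tokens=512)  # assumed settings
print(tokenizer.decode(output[0]))

# Prevent printing spurious transformers error when using pipeline with AutoGPTQ
logging.set_verbosity(logging.CRITICAL)

print("*** Pipeline:")
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,  # assumed settings
)
print(pipe(prompt_template)[0]['generated_text'])
```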
@@ -100,17 +103,17 @@ print(pipe(prompt_template)[0]['generated_text'])

## Provided files

-**camel-13b-roleplay-GPTQ-4bit
+**camel-13b-roleplay-GPTQ-4bit-128g.no-act.order.safetensors**

This will work with AutoGPTQ and CUDA versions of GPTQ-for-LLaMa. There are reports of issues with Triton mode of recent GPTQ-for-LLaMa. If you have issues, please use AutoGPTQ instead.

-It was created
+It was created with group_size 128 to increase inference accuracy, but without --act-order (desc_act) to increase compatibility and improve inference speed.

-* `camel-13b-roleplay-GPTQ-4bit
+* `camel-13b-roleplay-GPTQ-4bit-128g.no-act.order.safetensors`
  * Works with AutoGPTQ in CUDA or Triton modes.
  * Works with GPTQ-for-LLaMa in CUDA mode. May have issues with GPTQ-for-LLaMa Triton mode.
  * Works with text-generation-webui, including one-click-installers.
-  * Parameters: Groupsize =
+  * Parameters: Groupsize = 128. Act Order / desc_act = False.

<!-- footer start -->
## Discord
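For context, the parameters listed above (4-bit, groupsize 128, no act-order) map onto AutoGPTQ's `BaseQuantizeConfig`. Here is a rough sketch of how a file like this is typically produced; the calibration example and output directory are placeholders, not the exact command used for this repo:

```python
# Hedged sketch: AutoGPTQ settings matching the parameters listed above
# (4-bit, group_size 128, desc_act / act-order False). Calibration data and
# output path are placeholders, not taken from this commit.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

base_model = "camel-ai/CAMEL-13B-Role-Playing-Data"  # the unquantised base repo

quantize_config = BaseQuantizeConfig(
    bits=4,          # 4-bit quantisation
    group_size=128,  # Groupsize = 128
    desc_act=False,  # no --act-order
)

tokenizer = AutoTokenizer.from_pretrained(base_model, use_fast=True)
model = AutoGPTQForCausalLM.from_pretrained(base_model, quantize_config)

# `examples` should be a list of tokenised calibration samples; one placeholder here.
examples = [tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")]

model.quantize(examples)
model.save_quantized("camel-13b-roleplay-GPTQ-4bit-128g.no-act.order", use_safetensors=True)
```

Setting `desc_act=False` trades a little accuracy for the wider compatibility and faster inference the README describes.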
@@ -134,7 +137,7 @@ Donaters will get priority support on any and all AI/LLM/model questions and requests

**Special thanks to**: Luke from CarbonQuill, Aemon Algiz, Dmitriy Samsonov.

-**Patreon special mentions**:
+**Patreon special mentions**: Oscar Rangel, Eugene Pentland, Talal Aujan, Cory Kujawski, Luke, Asp the Wyvern, Ai Maven, Pyrater, Alps Aficionado, senxiiz, Willem Michiel, Junyu Yang, trip7s trip, Sebastain Graf, Joseph William Delisle, Lone Striker, Jonathan Leane, Johann-Peter Hartmann, David Flickinger, Spiking Neurons AB, Kevin Schuppel, Mano Prime, Dmitriy Samsonov, Sean Connelly, Nathan LeClaire, Alain Rossmann, Fen Risland, Derek Yates, Luke Pendergrass, Nikolai Manek, Khalefa Al-Ahmad, Artur Olbinski, John Detwiler, Ajan Kanaga, Imad Khwaja, Trenton Dambrowitz, Kalila, vamX, webtim, Illia Dulskyi.

Thank you to all my generous patrons and donaters!