Text Generation · Transformers · Safetensors · English · llama · text-generation-inference · 4-bit precision · gptq
TheBloke committed on
Commit 7038f7d
1 Parent(s): 0293d67

Update README.md

Files changed (1)
  1. README.md +11 -5
README.md CHANGED

```diff
@@ -1,6 +1,12 @@
 ---
 inference: false
-license: other
+license: cc-by-nc-sa-4.0
+language:
+- en
+library_name: transformers
+pipeline_tag: text-generation
+datasets:
+- psmathur/orca_minis_uncensored_dataset
 ---
 
 <!-- header start -->
@@ -89,17 +95,17 @@ model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
         quantize_config=None)
 
 prompt = "Tell me about AI"
+input = ""
 prompt_template=f'''### System:
 You are an AI assistant that follows instruction extremely well. Help as much as you can.
 
 ### User:
-prompt
+{prompt}
 
 ### Input:
-input, if required
+{input}
 
 ### Response:
-
 '''
 
 print("\n\n*** Generate:")
@@ -139,7 +145,7 @@ It was created with group_size 128 to increase inference accuracy, but without -
 
 * `orca_mini_v2_13b-GPTQ-4bit-128g.no-act.order.safetensors`
   * Works with AutoGPTQ in CUDA or Triton modes.
-  * [ExLlama](https://github.com/turboderp/exllama) suupports Llama 4-bit GPTQs, and will provide 2x speedup over AutoGPTQ and GPTQ-for-LLaMa.
+  * [ExLlama](https://github.com/turboderp/exllama) supports Llama 4-bit GPTQs, and will provide 2x speedup over AutoGPTQ and GPTQ-for-LLaMa.
   * Works with GPTQ-for-LLaMa in CUDA mode. May have issues with GPTQ-for-LLaMa Triton mode.
   * Works with text-generation-webui, including one-click-installers.
   * Parameters: Groupsize = 128. Act Order / desc_act = False.
```
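The substantive fix in the second hunk is easy to miss: in the old README the f-string's body contained the literal words `prompt` and `input, if required`, so the `prompt` variable defined just above it was never interpolated into the template. The commit swaps in real `{prompt}` and `{input}` placeholders and defines `input = ""` so the template always has a value. Below is a minimal runnable sketch of the corrected snippet, reconstructed from the diff's context lines; the repo id and the generation parameters are assumptions for illustration, not part of this commit.

```python
# Sketch of the corrected inference snippet, reconstructed from the context
# lines visible in the diff above. The repo id and generation settings are
# assumptions; see the full README for the exact values.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_name_or_path = "TheBloke/orca_mini_v2_13b-GPTQ"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
        use_safetensors=True,
        device="cuda:0",
        quantize_config=None)

prompt = "Tell me about AI"
input = ""  # added by the commit; shadows the builtin, but matches the README

# Before this commit the template contained the bare words "prompt" and
# "input, if required"; without braces, the f-string interpolated nothing.
prompt_template = f'''### System:
You are an AI assistant that follows instruction extremely well. Help as much as you can.

### User:
{prompt}

### Input:
{input}

### Response:
'''

print("\n\n*** Generate:")
input_ids = tokenizer(prompt_template, return_tensors="pt").input_ids.to("cuda:0")
output = model.generate(inputs=input_ids, temperature=0.7, max_new_tokens=512)
print(tokenizer.decode(output[0]))
```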
 
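For context on the last bullet, the listed parameters map directly onto AutoGPTQ's quantization config. A minimal sketch of an equivalent config follows, assuming the `auto_gptq` package; the actual quantization was performed by the repo author, and nothing below comes from this commit.

```python
# Hypothetical AutoGPTQ config matching the bullet above: 4-bit weights,
# group size 128, act-order (desc_act) disabled.
from auto_gptq import BaseQuantizeConfig

quantize_config = BaseQuantizeConfig(
    bits=4,          # matches the repo's "4-bit precision" tag
    group_size=128,  # "Groupsize = 128"
    desc_act=False,  # "Act Order / desc_act = False"
)
```

As the (truncated) hunk context above suggests, group_size 128 was chosen to increase inference accuracy, while leaving act-order off keeps the file compatible with a wider range of GPTQ loaders.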