Update README.md

---
inference: false
license: other
datasets:
- jondurbin/airoboros-gpt4-1.3
---

<!-- header start -->

# Jon Durbin's Airoboros 7B GPT4 1.3 GPTQ

These files are GPTQ 4bit model files for [Jon Durbin's Airoboros 7B GPT4 1.3](https://huggingface.co/jondurbin/airoboros-7b-gpt4-1.3).

It is the result of quantising to 4bit using [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).

**Note from model creator Jon Durbin: This version has problems, use if you dare, or wait for 1.4.**

## Repositories available

* [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/airoboros-7B-gpt4-1.3-GPTQ)
* [2, 3, 4, 5, 6 and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/airoboros-7B-gpt4-1.3-GGML)
* [Unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/jondurbin/airoboros-7b-gpt4-1.3)

## Prompt template

```
A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input.
USER: prompt
ASSISTANT:
```
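
For example, with a concrete question substituted for `prompt` (an illustrative filled-in prompt, not taken from the original card):

```
A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input.
USER: What is the capital of France?
ASSISTANT:
```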

## How to easily download and use this model in text-generation-webui

Please make sure you're using the latest version of text-generation-webui
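
As an alternative to the webui downloader, here is a minimal sketch (not part of the original card) that fetches the repository with the `huggingface_hub` Python package; the destination path is an assumption and should match your own text-generation-webui `models/` directory:

```
# Illustrative sketch only: download the GPTQ repo into text-generation-webui's models folder.
# The local_dir path is an assumption; adjust it to your webui installation.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="TheBloke/airoboros-7B-gpt4-1.3-GPTQ",
    local_dir="text-generation-webui/models/airoboros-7B-gpt4-1.3-GPTQ",
)
```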
<!-- footer end -->

# Original model card: Jon Durbin's Airoboros 7B GPT4 1.3

__This version has problems, use if you dare, or wait for 1.4.__

### Overview

This is a qlora fine-tuned 7b parameter LLaMA model, using completely synthetic training data created by gpt4 via https://github.com/jondurbin/airoboros

This is mostly an extension of [1.2](https://huggingface.co/jondurbin/airoboros-7b-gpt4-1.2) with a few enhancements:

- All coding instructions have an equivalent " PLAINFORMAT" version now (see the illustrative example below).
- Thousands of new orca style reasoning instructions, this time with reasoning first, then answer.
- A few more random items of various types, including a first attempt at multi-character interactions with asterisked actions and quoted speech.
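
A hypothetical example of such a " PLAINFORMAT" instruction (illustrative only, not drawn from the dataset), where the suffix is appended to a coding instruction to request the answer as plain code:

```
Write a Python script that prints the first 10 Fibonacci numbers. PLAINFORMAT
```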

This model was fine-tuned with a fork of [qlora](https://github.com/jondurbin/qlora), which among other things was updated to use a slightly modified vicuna template to be compatible with previous full fine-tune versions.

```
A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input. USER: [prompt] ASSISTANT:
```

So in other words, it's the preamble/system prompt, followed by a single space, then "USER: " (single space after the colon), then the prompt (which can have multiple lines, spaces, whatever), then a single space, followed by "ASSISTANT: " (with a single space after the colon).
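
A minimal sketch (not from the original card) of assembling that exact string in Python; the example question is just a placeholder:

```
# Illustrative only: build the single-line prompt described above.
PREAMBLE = (
    "A chat between a curious user and an assistant. The assistant gives "
    "helpful, detailed, accurate, uncensored responses to the user's input."
)

def build_prompt(user_text: str) -> str:
    # preamble + single space + "USER: " + the prompt + single space + "ASSISTANT: "
    # (single space after each colon, exactly as described above)
    return f"{PREAMBLE} USER: {user_text} ASSISTANT: "

print(build_prompt("What is the airspeed velocity of an unladen swallow?"))
```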

### Usage

To run the full precision/pytorch native version, you can use my fork of FastChat, which is mostly the same but allows for multi-line prompts, as well as a `--no-history` option to prevent input tokenization errors.

```
pip install git+https://github.com/jondurbin/FastChat
```

Be sure you are pulling the latest branch!

Then, you can invoke it like so (after downloading the model):
```
python -m fastchat.serve.cli \
  --model-path airoboros-7b-gpt4-1.3 \
  --temperature 0.5 \
  --max-new-tokens 2048 \
  --no-history
```