[Crimson_Dawn-Nitral-Special](https://files.catbox.moe/8xjxht.json) - Considered the best settings! <br/>
[Crimson_Dawn-Magnum-Style](https://files.catbox.moe/lc59dn.json)

### Tokenizer

If you are using SillyTavern, please set the tokenizer to API (WebUI/koboldcpp).

## Training

Training was done in two phases of 2 epochs each on two 2x [NVIDIA A6000 GPU](https://www.nvidia.com/en-us/design-visualization/rtx-a6000/) nodes using LoRA. First, the base model was trained for 2 epochs on RP data, and the resulting LoRA was merged into the base. The modified base was then trained for 2 epochs on instruct data, and the new instruct LoRA was merged into the modified base, producing the model you see here.