Text Generation
Safetensors
mistral
conversational
Epiculous committed
Commit cacfaa0
1 Parent(s): 8982819

Update README.md

Files changed (1)
  1. README.md +7 -92
README.md CHANGED
@@ -25,104 +25,19 @@ Back from the dead! Hoping to make something cool to share with everyone! Introd
 [exl2](https://huggingface.co/lucyknada/Epiculous_Crimson_Dawn-V0.1-exl2) / [gguf](https://huggingface.co/mradermacher/Crimson_Dawn-V0.1-GGUF)
 
 ## Prompting
-Crimson Dawn was trained with the Mistral Instruct template, therefore it should be prompted in the same way that you would prompt any other mistral model.
+Crimson Dawn was trained with the Mistral Instruct template, so it should be prompted in much the same way as any other Mistral-based model.
 
 ```
 "<s>[INST] Prompt goes here [/INST]</s>"
 ```
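As a quick sanity check, the same string can be produced with the tokenizer's chat template. A minimal sketch, assuming the weights live at `Epiculous/Crimson_Dawn-V0.1` (inferred from the quant links above):

```python
# Minimal sketch: rendering the Mistral Instruct format with transformers.
# The repo id is an assumption inferred from the quant links above.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Epiculous/Crimson_Dawn-V0.1")

messages = [{"role": "user", "content": "Prompt goes here"}]

# For Mistral-style templates the assistant reply begins right after [/INST];
# the chat template adds the <s>/[INST] wrapping for you.
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)  # -> "<s>[INST] Prompt goes here [/INST]"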
+### Context and Instruct
+[Mistral-Custom-Context.json](https://files.catbox.moe/l9w0ry.json)
+[Mistral-Custom-Instruct.json](https://files.catbox.moe/9xiiwb.json)
+
 
 ### Current Top Sampler Settings
-```json
-{
-    "temp": 1.25,
-    "temperature_last": true,
-    "top_p": 1,
-    "top_k": -1,
-    "top_a": 0,
-    "tfs": 1,
-    "epsilon_cutoff": 0,
-    "eta_cutoff": 0,
-    "typical_p": 1,
-    "min_p": 0.3,
-    "rep_pen": 1,
-    "rep_pen_range": 0,
-    "rep_pen_decay": 0,
-    "rep_pen_slope": 1,
-    "no_repeat_ngram_size": 0,
-    "penalty_alpha": 0,
-    "num_beams": 1,
-    "length_penalty": 1,
-    "min_length": 0,
-    "encoder_rep_pen": 1,
-    "freq_pen": 0,
-    "presence_pen": 0,
-    "skew": 0,
-    "do_sample": true,
-    "early_stopping": false,
-    "dynatemp": false,
-    "min_temp": 0,
-    "max_temp": 2,
-    "dynatemp_exponent": 1,
-    "smoothing_factor": 0,
-    "smoothing_curve": 1,
-    "dry_allowed_length": 2,
-    "dry_multiplier": 0,
-    "dry_base": 1.75,
-    "dry_sequence_breakers": "[\"\\n\", \":\", \"\\\"\", \"*\"]",
-    "dry_penalty_last_n": 0,
-    "add_bos_token": true,
-    "ban_eos_token": false,
-    "skip_special_tokens": true,
-    "mirostat_mode": 0,
-    "mirostat_tau": 5,
-    "mirostat_eta": 0.1,
-    "guidance_scale": 1,
-    "negative_prompt": "",
-    "grammar_string": "",
-    "json_schema": {},
-    "banned_tokens": "",
-    "sampler_priority": [
-        "temperature",
-        "dynamic_temperature",
-        "quadratic_sampling",
-        "top_k",
-        "top_p",
-        "typical_p",
-        "epsilon_cutoff",
-        "eta_cutoff",
-        "tfs",
-        "top_a",
-        "min_p",
-        "mirostat"
-    ],
-    "samplers": [
-        "top_k",
-        "tfs_z",
-        "typical_p",
-        "top_p",
-        "min_p",
-        "temperature"
-    ],
-    "ignore_eos_token": false,
-    "spaces_between_special_tokens": true,
-    "speculative_ngram": false,
-    "sampler_order": [5, 6, 0, 1, 2, 3, 4],
-    "logit_bias": [],
-    "ignore_eos_token_aphrodite": false,
-    "spaces_between_special_tokens_aphrodite": true,
-    "rep_pen_size": 0,
-    "genamt": 1024,
-    "max_length": 16384
-}
-```
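The removed preset boils down to a high temperature with a strong min_p floor, everything else left neutral. A rough translation of those key values onto a plain `transformers` `generate()` call, as a sketch rather than an exact port (the repo id is assumed, min_p needs a recent transformers release, and transformers applies temperature before min_p by default, unlike the preset's `temperature_last`):

```python
# Rough sketch: the preset's key values (temp 1.25, min_p 0.3, neutral top_p/top_k)
# mapped onto transformers generate(). Not an exact port: the preset applies
# temperature last, while transformers applies it before min_p by default.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Epiculous/Crimson_Dawn-V0.1"  # assumed repo id, inferred from the quant links
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

inputs = tokenizer("[INST] Prompt goes here [/INST]", return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.25,     # "temp"
    min_p=0.3,            # "min_p"; requires a recent transformers version
    top_p=1.0,            # neutral, as in the preset
    top_k=0,              # the preset's -1 (disabled) maps to 0 here
    max_new_tokens=1024,  # "genamt"
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```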
+[Crimson_Dawn-Magnum-Style](https://files.catbox.moe/lc59dn.json)
+[Crimson_Dawn-Nitral-Special](https://files.catbox.moe/8xjxht.json)
 
 ## Training
 Training was done in two runs of 2 epochs each on 2x [NVIDIA A6000 GPUs](https://www.nvidia.com/en-us/design-visualization/rtx-a6000/) using LoRA. A two-phased approach was used: the base model was first trained for 2 epochs on RP data and the resulting LoRA was applied to the base; the modified base was then trained for 2 epochs on instruct data, and the new instruct LoRA was applied to the modified base, resulting in what you see here.
 
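Concretely, the flow is: train a LoRA on RP data, apply (merge) it into the base, train a second LoRA on instruct data on top of that merged model, and merge again. A minimal sketch with `peft`; the base repo id, dataset handles, and hyperparameters are placeholders, not the published recipe:

```python
# Minimal sketch of the two-phase LoRA flow described above, using peft.
# Base repo id, dataset handles, and hyperparameters are placeholders;
# the actual training recipe is not published in this card.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("mistral-base-repo-id")  # placeholder

def train_for_two_epochs(model, dataset_name):
    """Placeholder for ~2 epochs of fine-tuning (e.g. via transformers.Trainer)."""
    ...

# Phase 1: train a LoRA on RP data, then apply (merge) it into the base.
lora_cfg = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")  # placeholder values
phase1 = get_peft_model(base, lora_cfg)
train_for_two_epochs(phase1, "rp-data")          # hypothetical dataset handle
modified_base = phase1.merge_and_unload()        # RP LoRA applied to base

# Phase 2: train a fresh LoRA on instruct data, then merge into the modified base.
phase2 = get_peft_model(modified_base, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))
train_for_two_epochs(phase2, "instruct-data")    # hypothetical dataset handle
final_model = phase2.merge_and_unload()
```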