lemonilia committed on
Commit
78c69b1
1 Parent(s): 4d76b14

Update README.md

Files changed (1)
  1. README.md +9 -5
README.md CHANGED
@@ -9,16 +9,20 @@ about 1800 training samples _up to_ 4k tokens length. A 2-pass training procedur
  finetuning on about 6800 stories within 4k tokens length and the second pass is LimaRP with changes introducing more effective
  control on response length.
 
- **Due to software limitations, finetuning didn't take advantage yet of the Sliding Window Attention (SWA) which would have allowed
- to use longer conversations in the training data. Thus, this version of LimaRP should be considered an _initial finetuning attempt_ and
- will be updated in the future.**
-
  For more details about LimaRP, see the model page for the [previously released v2 version for Llama-2](https://huggingface.co/lemonilia/limarp-llama2-v2).
  Most details written there apply for this version as well. Generally speaking, LimaRP is a longform-oriented, novel-style
  roleplaying chat model intended to replicate the experience of 1-on-1 roleplay on Internet forums. Short-form,
  IRC/Discord-style RP (aka "Markdown format") is not supported yet. The model does not include instruction tuning,
  only manually picked and slightly edited RP conversations with persona and scenario data.
 
+ ## Known issues
+ - Due to software limitations, finetuning didn't take advantage yet of the Sliding Window Attention (SWA) which would have allowed
+ to use longer conversations in the training data and a more accurate behavior with Scenario information. Thus, this version of LimaRP
+ should be considered preliminary and will be updated in the future.
+ - Despite performing a few finetuning attempts, including one that followed almost the same procedure as in previous releases,
+ Mistral-7B-v0.1 appears to have strange repetition issues.
+ - Even though benchmarks tell a different story, in practice the model doesn't feel smarter during roleplay than Llama-2-13B.
+
  ## Prompt format
  Same as before. It uses the [extended Alpaca format](https://github.com/tatsu-lab/stanford_alpaca),
  with `### Input:` immediately preceding user inputs and `### Response:` immediately preceding
@@ -131,4 +135,4 @@ the base Mistral-7B-v0.1 model.
  For the second pass, the `lora_model_dir` option was used to continue finetuning on the LoRA
  adapter obtained from the first pass.
 
- Using 2 GPUs, the effective global batch size would have been 8.
+ Using 4 GPUs, the effective global batch size would have been 8.
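
The "Prompt format" section touched by the first hunk references the extended Alpaca format, with `### Input:` preceding user turns and `### Response:` preceding model turns, plus persona and scenario data. Below is a minimal sketch of how such a prompt might be assembled; the `build_prompt` helper, the `Character's Persona:`/`Scenario:` field labels, and the placement of that data under a `### Instruction:` header are illustrative assumptions, not details taken from the model card.

```python
# Minimal sketch of assembling a prompt in the extended Alpaca format described above.
# Only the "### Input:" and "### Response:" markers come from the model card; the
# helper name, field labels, and "### Instruction:" layout are assumptions.
def build_prompt(persona: str, scenario: str, turns: list[tuple[str, str]]) -> str:
    parts = [
        "### Instruction:",
        f"Character's Persona: {persona}",  # assumed label
        f"Scenario: {scenario}",            # assumed label
        "",
    ]
    for user_message, character_reply in turns:
        parts += ["### Input:", user_message, "", "### Response:", character_reply, ""]
    # Trailing whitespace is trimmed so an empty final reply leaves the prompt
    # ending right after "### Response:", ready for the model to continue.
    return "\n".join(parts).rstrip() + "\n"


if __name__ == "__main__":
    prompt = build_prompt(
        persona="A taciturn blacksmith with a soft spot for stray cats.",
        scenario="A traveler takes shelter in the forge during a storm.",
        turns=[("Mind if I wait out the rain here?", "")],  # empty reply: model generates it
    )
    print(prompt)
```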
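The second hunk's changed line about the effective global batch size follows the usual multi-GPU arithmetic (number of GPUs × per-device micro-batch × gradient-accumulation steps). The sketch below uses hypothetical micro-batch and accumulation values chosen only to reproduce the stated figure of 8, and the adapter path shown for the second pass's `lora_model_dir` option is likewise illustrative.

```python
# Rough sketch of the arithmetic behind "Using 4 GPUs, the effective global
# batch size would have been 8". The micro-batch and accumulation values are
# assumptions picked to match that figure; they are not from the training config.
num_gpus = 4
micro_batch_size = 2        # assumed per-GPU batch size
gradient_accumulation = 1   # assumed accumulation steps

effective_global_batch = num_gpus * micro_batch_size * gradient_accumulation
print(effective_global_batch)  # -> 8

# For the second pass, the README says finetuning continued from the first-pass
# LoRA adapter via the trainer's `lora_model_dir` option, e.g. (illustrative path):
#   lora_model_dir: ./limarp-mistral-pass1-lora
```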