Lambent committed on
Commit
14d84d2
1 Parent(s): fc32b83

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -39,7 +39,7 @@ This is a merge of pre-trained language models created using [mergekit](https://
  From there, each of the four threads was separately task-tuned on 2 datasets each.
  Various methods of combining those via merge were tested, with this one scoring highest on EQ-Bench as an indicator.
 
- My understanding of the Model Stock merge method is that it mitigates task adaptation to a significant degree, but also significantly limits forgetting caused by training.
+ My understanding of the Model Stock merge method is that it reduces task adaptation to a significant degree, but also significantly limits forgetting caused by training.
  I have hope that the adaptation, especially over two stages, is still sufficient to aid in longer contexts and multi-turn conversations from the ancestor models, and add some individual style while retaining a fair amount of their capability.
 
  This model's refusals are ... not nonexistent, but certainly don't rely on them.