Darkhn committed
Commit 7cd5e7f · verified · 1 Parent(s): 646ecc7

Update README.md

Files changed (1)
  1. README.md +92 -96
README.md CHANGED
@@ -1,96 +1,92 @@
- ---
- base_model:
- - TareksLab/M-MERGE1
- - TareksLab/M-BASE-SCE
- - TareksLab/M-MERGE3
- - TareksLab/M-MERGE2
- - TareksLab/M-MERGE4
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
- # merge
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [DARE TIES](https://arxiv.org/abs/2311.03099) merge method using [TareksLab/M-BASE-SCE](https://huggingface.co/TareksLab/M-BASE-SCE) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [TareksLab/M-MERGE1](https://huggingface.co/TareksLab/M-MERGE1)
- * [TareksLab/M-MERGE3](https://huggingface.co/TareksLab/M-MERGE3)
- * [TareksLab/M-MERGE2](https://huggingface.co/TareksLab/M-MERGE2)
- * [TareksLab/M-MERGE4](https://huggingface.co/TareksLab/M-MERGE4)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- models:
-   - model: TareksLab/M-MERGE4
-     parameters:
-       weight:
-         - filter: self_attn
-           value: [0.3, 0.1, 0.2]
-         - filter: mlp
-           value: [0.4, 0.2, 0.1]
-         - value: 0.2
-       density: 0.7
-       lambda: 1.05
-   - model: TareksLab/M-MERGE3
-     parameters:
-       weight:
-         - filter: self_attn
-           value: [0.2, 0.1, 0.3]
-         - filter: mlp
-           value: [0.3, 0.1, 0.2]
-         - value: 0.2
-       density: 0.65
-       lambda: 1.05
-   - model: TareksLab/M-MERGE2
-     parameters:
-       weight:
-         - filter: self_attn
-           value: [0.1, 0.3, 0.1]
-         - filter: mlp
-           value: [0.2, 0.3, 0.1]
-         - value: 0.2
-       density: 0.6
-       lambda: 1.05
-   - model: TareksLab/M-MERGE1
-     parameters:
-       weight:
-         - filter: self_attn
-           value: [0.2, 0.2, 0.1]
-         - filter: mlp
-           value: [0.1, 0.2, 0.2]
-         - value: 0.2
-       density: 0.6
-       lambda: 1
-   - model: TareksLab/M-BASE-SCE
-     parameters:
-       weight:
-         - filter: self_attn
-           value: [0.1, 0.3, 0.3]
-         - filter: mlp
-           value: [0.1, 0.2, 0.4]
-         - value: 0.2
-       density: 0.55
-       lambda: 1
- base_model: TareksLab/M-BASE-SCE
- merge_method: dare_ties
- parameters:
-   normalize: false
-   pad_to_multiple_of: 4
- tokenizer:
-   source: TareksLab/M-TOKENIZER-SCE
- chat_template: llama3
- dtype: bfloat16
- ```
 
+ ---
+ base_model_relation: quantized
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ base_model:
+ - TareksTesting/Legion-V1A-LLaMa-70B
+ ---
+ # merge
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [DARE TIES](https://arxiv.org/abs/2311.03099) merge method using [TareksLab/M-BASE-SCE](https://huggingface.co/TareksLab/M-BASE-SCE) as a base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [TareksLab/M-MERGE1](https://huggingface.co/TareksLab/M-MERGE1)
+ * [TareksLab/M-MERGE3](https://huggingface.co/TareksLab/M-MERGE3)
+ * [TareksLab/M-MERGE2](https://huggingface.co/TareksLab/M-MERGE2)
+ * [TareksLab/M-MERGE4](https://huggingface.co/TareksLab/M-MERGE4)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ models:
+   - model: TareksLab/M-MERGE4
+     parameters:
+       weight:
+         - filter: self_attn
+           value: [0.3, 0.1, 0.2]
+         - filter: mlp
+           value: [0.4, 0.2, 0.1]
+         - value: 0.2
+       density: 0.7
+       lambda: 1.05
+   - model: TareksLab/M-MERGE3
+     parameters:
+       weight:
+         - filter: self_attn
+           value: [0.2, 0.1, 0.3]
+         - filter: mlp
+           value: [0.3, 0.1, 0.2]
+         - value: 0.2
+       density: 0.65
+       lambda: 1.05
+   - model: TareksLab/M-MERGE2
+     parameters:
+       weight:
+         - filter: self_attn
+           value: [0.1, 0.3, 0.1]
+         - filter: mlp
+           value: [0.2, 0.3, 0.1]
+         - value: 0.2
+       density: 0.6
+       lambda: 1.05
+   - model: TareksLab/M-MERGE1
+     parameters:
+       weight:
+         - filter: self_attn
+           value: [0.2, 0.2, 0.1]
+         - filter: mlp
+           value: [0.1, 0.2, 0.2]
+         - value: 0.2
+       density: 0.6
+       lambda: 1
+   - model: TareksLab/M-BASE-SCE
+     parameters:
+       weight:
+         - filter: self_attn
+           value: [0.1, 0.3, 0.3]
+         - filter: mlp
+           value: [0.1, 0.2, 0.4]
+         - value: 0.2
+       density: 0.55
+       lambda: 1
+ base_model: TareksLab/M-BASE-SCE
+ merge_method: dare_ties
+ parameters:
+   normalize: false
+   pad_to_multiple_of: 4
+ tokenizer:
+   source: TareksLab/M-TOKENIZER-SCE
+ chat_template: llama3
+ dtype: bfloat16
+ ```
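
A configuration like the one above can be applied with mergekit. Below is a minimal sketch using mergekit's documented library entry points (`MergeConfiguration`, `MergeOptions`, `run_merge`); the `config.yaml` and `./merged` paths are placeholders, and exact option names may vary between mergekit versions.

```python
# Sketch: run a mergekit YAML config like the one above.
# "config.yaml" and "./merged" are placeholder paths, not part of this repo.
import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./merged",                 # directory that receives the merged weights
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # use a GPU if one is present
        copy_tokenizer=True,             # write the tokenizer next to the weights
        lazy_unpickle=False,             # experimental low-memory loader
        low_cpu_memory=False,
    ),
)
```

The same config can also be run from the command line with `mergekit-yaml config.yaml ./merged`.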
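
Since the config sets `chat_template: llama3` and `dtype: bfloat16`, the merged model can be prompted through its chat template with standard transformers APIs. A hedged sketch follows; the repo id is a placeholder for wherever the merged weights are published.

```python
# Sketch: load the merged model and generate with its llama3 chat template.
# "path/to/merged-model" is a placeholder (local output dir or Hub repo id).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/merged-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Hello!"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```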