Update README.md
README.md CHANGED
@@ -1,10 +1,18 @@
 ---
-base_model:
+base_model:
+- meta-llama/Meta-Llama-3-8B
+- shisa-ai/shisa-v1-llama3-8b
+- rinna/llama-3-youko-8b
+- tokyotech-llm/Llama-3-Swallow-8B-v0.1
 library_name: transformers
 tags:
 - mergekit
 - merge
-
+license: llama3
+language:
+- ja
+- en
+pipeline_tag: text-generation
 ---
 # merge2

@@ -13,14 +21,14 @@ This is a merge of pre-trained language models created using [mergekit](https://
 ## Merge Details
 ### Merge Method

-This model was merged using the [SCE](https://arxiv.org/abs/2408.07990) merge method using /
+This model was merged using the [SCE](https://arxiv.org/abs/2408.07990) merge method using [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) as a base.

 ### Models Merged

 The following models were included in the merge:
-* /
-* /
-* /
+* [shisa-ai/shisa-v1-llama3-8b](https://huggingface.co/shisa-ai/shisa-v1-llama3-8b)
+* [rinna/llama-3-youko-8b](https://huggingface.co/rinna/llama-3-youko-8b)
+* [tokyotech-llm/Llama-3-Swallow-8B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3-Swallow-8B-v0.1)

 ### Configuration

@@ -30,16 +38,16 @@ The following YAML configuration was used to produce this model:

 models:
   # Pivot model
-  - model:
+  - model: meta-llama/Meta-Llama-3-8B
   # Target models
-  - model: /
-  - model: /
-  - model: /
+  - model: shisa-ai/shisa-v1-llama3-8b
+  - model: rinna/llama-3-youko-8b
+  - model: tokyotech-llm/Llama-3-Swallow-8B-v0.1
 merge_method: sce
-base_model:
+base_model: meta-llama/Meta-Llama-3-8B
 parameters:
   select_topk: 0.65
   int8_mask: true
 dtype: bfloat16

-```
+```
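The `select_topk: 0.65` parameter in the configuration above is specific to the SCE method. The snippet below is a rough, single-tensor illustration of the Select / Calculate / Erase idea from the linked paper (keep the highest-variance delta elements, weight each model by the magnitude of what survives, drop sign-conflicting entries). It is not mergekit's implementation; the function name, normalisation, and other details are assumptions made for the sketch.

```python
# Toy, single-tensor sketch of an SCE-style merge. NOT mergekit's actual code;
# it only illustrates what `select_topk` roughly controls.
import torch

def sce_merge_tensor(base: torch.Tensor,
                     candidates: list[torch.Tensor],
                     select_topk: float = 0.65) -> torch.Tensor:
    # Task vectors: how each candidate model differs from the pivot/base.
    deltas = torch.stack([c - base for c in candidates])  # (n_models, *param_shape)

    # Select: keep only the fraction `select_topk` of elements with the highest
    # variance across candidates, zeroing out the rest.
    variance = deltas.var(dim=0)
    k = max(1, int(select_topk * variance.numel()))
    threshold = variance.flatten().topk(k).values.min()
    deltas = deltas * (variance >= threshold).to(deltas.dtype)

    # Calculate: per-model fusion weights proportional to the squared magnitude
    # of each model's surviving task vector.
    weights = (deltas ** 2).sum(dim=tuple(range(1, deltas.dim())))
    weights = weights / weights.sum().clamp_min(1e-12)

    # Erase: zero out elements whose sign disagrees with the overall direction.
    majority_sign = torch.sign(deltas.sum(dim=0))
    deltas = deltas * (torch.sign(deltas) == majority_sign).to(deltas.dtype)

    # Merge: weighted sum of the surviving task vectors, added back onto the base.
    fused = (weights.view(-1, *([1] * (deltas.dim() - 1))) * deltas).sum(dim=0)
    return base + fused
```

With `select_topk=0.65`, roughly the 65% highest-variance delta entries would survive the selection step before the sign-consensus filter is applied.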
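The card tags the merge for Japanese and English text generation. A minimal usage sketch with Transformers follows; the repository id is a placeholder, since the card only names the output `merge2`, so point it at wherever the merged weights actually live.

```python
# Minimal text-generation sketch for the merged model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-name/merge2"  # placeholder repo id or local path to the merged weights
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used for the merge
    device_map="auto",
)

prompt = "日本の首都はどこですか？"  # the card lists Japanese and English
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```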