lbourdois committed
Commit dcc78a2 · verified · 1 Parent(s): ce49329

Improve language tag


Hi! As the model is multilingual, this PR adds languages other than English to the `language` tag to improve how the model is referenced. Note that 29 languages are announced in the README, but only 13 are explicitly listed, so I was only able to add those 13.
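For anyone who wants to apply the same kind of metadata edit to another repo, it can also be scripted with `huggingface_hub`. A minimal sketch, using a placeholder repo id (this is an illustration, not necessarily the tooling behind this PR):

```python
# Minimal sketch: replace a model card's `language` metadata and open a PR.
from huggingface_hub import metadata_update

# The 13 languages explicitly listed in the model's README.
languages = [
    "zho", "eng", "fra", "spa", "por", "deu", "ita",
    "rus", "jpn", "kor", "vie", "tha", "ara",
]

metadata_update(
    "your-namespace/your-model",  # hypothetical repo id -- replace it
    {"language": languages},
    overwrite=True,   # replace the existing `language` list
    create_pr=True,   # propose the change as a pull request
)
```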

Files changed (1)
  1. README.md +89 -78
README.md CHANGED
@@ -1,79 +1,90 @@
- ---
- base_model:
- - SicariusSicariiStuff/Impish_QWEN_14B-1M
- - deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
- - Qwen/Qwen2.5-14B-Instruct-1M
- - Zhihu-ai/Zhi-writing-dsr1-14b
- - huihui-ai/Qwen2.5-14B-Instruct-1M-abliterated
- - huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
- - Qwen/Qwen2.5-14B-Instruct
- - mergekit-community/Qwen2.5-14B-della-code
- - tanliboy/lambda-qwen2.5-14b-dpo-test
- library_name: transformers
- tags:
- - mergekit
- - merge
- license: apache-2.0
- language:
- - en
- - zh
- pipeline_tag: text-generation
- ---
- *From the preliminary test results, the effect is really excellent!!!*
-
- This is definitely a very promising method for model merging!
-
- The current optimal formula ratio is: instruction: reasoning: code = 6:2:1
-
- # merge
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [Karcher Mean](https://en.wikipedia.org/wiki/Karcher_mean) merge method using [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
-
- **instruction:**(6)
- * [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
- * [Qwen/Qwen2.5-14B-Instruct-1M](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct-1M)
- * [huihui-ai/Qwen2.5-14B-Instruct-1M-abliterated](https://huggingface.co/huihui-ai/Qwen2.5-14B-Instruct-1M-abliterated)
- * [huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2](https://huggingface.co/huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2)
- * [SicariusSicariiStuff/Impish_QWEN_14B-1M](https://huggingface.co/SicariusSicariiStuff/Impish_QWEN_14B-1M)
- * [tanliboy/lambda-qwen2.5-14b-dpo-test](https://huggingface.co/tanliboy/lambda-qwen2.5-14b-dpo-test)
-
- **reasoning:**(2)
- * [Zhihu-ai/Zhi-writing-dsr1-14b](https://huggingface.co/Zhihu-ai/Zhi-writing-dsr1-14b)
- * [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)
-
- **code:**(1)
- * [mergekit-community/Qwen2.5-14B-della-code](https://huggingface.co/mergekit-community/Qwen2.5-14B-della-code)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- models:
- - model: mergekit-community/Qwen2.5-14B-della-code
- - model: Zhihu-ai/Zhi-writing-dsr1-14b
- - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
- - model: huihui-ai/Qwen2.5-14B-Instruct-1M-abliterated
- - model: huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
- - model: Qwen/Qwen2.5-14B-Instruct-1M
- - model: SicariusSicariiStuff/Impish_QWEN_14B-1M
- - model: tanliboy/lambda-qwen2.5-14b-dpo-test
- - model: Qwen/Qwen2.5-14B-Instruct
- merge_method: karcher
- base_model: Qwen/Qwen2.5-14B-Instruct
- parameters:
- max_iter: 1000
- normalize: true
- int8_mask: true
- tokenizer_source: base
- dtype: float16
+ ---
+ base_model:
+ - SicariusSicariiStuff/Impish_QWEN_14B-1M
+ - deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
+ - Qwen/Qwen2.5-14B-Instruct-1M
+ - Zhihu-ai/Zhi-writing-dsr1-14b
+ - huihui-ai/Qwen2.5-14B-Instruct-1M-abliterated
+ - huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
+ - Qwen/Qwen2.5-14B-Instruct
+ - mergekit-community/Qwen2.5-14B-della-code
+ - tanliboy/lambda-qwen2.5-14b-dpo-test
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ license: apache-2.0
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ pipeline_tag: text-generation
+ ---
+ *From the preliminary test results, the effect is really excellent!!!*
+
+ This is definitely a very promising method for model merging!
+
+ The current optimal formula ratio is: instruction: reasoning: code = 6:2:1
+
+ # merge
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [Karcher Mean](https://en.wikipedia.org/wiki/Karcher_mean) merge method using [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) as a base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+
+ **instruction:**(6)
+ * [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
+ * [Qwen/Qwen2.5-14B-Instruct-1M](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct-1M)
+ * [huihui-ai/Qwen2.5-14B-Instruct-1M-abliterated](https://huggingface.co/huihui-ai/Qwen2.5-14B-Instruct-1M-abliterated)
+ * [huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2](https://huggingface.co/huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2)
+ * [SicariusSicariiStuff/Impish_QWEN_14B-1M](https://huggingface.co/SicariusSicariiStuff/Impish_QWEN_14B-1M)
+ * [tanliboy/lambda-qwen2.5-14b-dpo-test](https://huggingface.co/tanliboy/lambda-qwen2.5-14b-dpo-test)
+
+ **reasoning:**(2)
+ * [Zhihu-ai/Zhi-writing-dsr1-14b](https://huggingface.co/Zhihu-ai/Zhi-writing-dsr1-14b)
+ * [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)
+
+ **code:**(1)
+ * [mergekit-community/Qwen2.5-14B-della-code](https://huggingface.co/mergekit-community/Qwen2.5-14B-della-code)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ models:
+ - model: mergekit-community/Qwen2.5-14B-della-code
+ - model: Zhihu-ai/Zhi-writing-dsr1-14b
+ - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
+ - model: huihui-ai/Qwen2.5-14B-Instruct-1M-abliterated
+ - model: huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
+ - model: Qwen/Qwen2.5-14B-Instruct-1M
+ - model: SicariusSicariiStuff/Impish_QWEN_14B-1M
+ - model: tanliboy/lambda-qwen2.5-14b-dpo-test
+ - model: Qwen/Qwen2.5-14B-Instruct
+ merge_method: karcher
+ base_model: Qwen/Qwen2.5-14B-Instruct
+ parameters:
+ max_iter: 1000
+ normalize: true
+ int8_mask: true
+ tokenizer_source: base
+ dtype: float16
  ```
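
A note on the `karcher` merge method named in the README above: the Karcher (Fréchet) mean generalizes averaging to curved spaces, and the standard concrete case is the mean of unit vectors on a hypersphere, computed by repeatedly averaging in the tangent space at the current estimate. The NumPy sketch below illustrates that iteration only; it is not mergekit's implementation, and `max_iter` merely mirrors the `max_iter: 1000` parameter in the config:

```python
import numpy as np

def karcher_mean_sphere(points, max_iter=1000, tol=1e-10):
    """Riemannian (Karcher) mean of vectors projected onto the unit sphere.

    Illustrative sketch: lift points to the tangent space at the current
    estimate (log map), average there, and step back along the geodesic
    (exp map) until the mean tangent vector vanishes.
    """
    # Project all inputs onto the unit sphere.
    xs = [p / np.linalg.norm(p) for p in points]
    # Start from the normalized Euclidean mean.
    mu = sum(xs)
    mu /= np.linalg.norm(mu)

    for _ in range(max_iter):
        # Log map: lift each point into the tangent space at mu.
        tangent = np.zeros_like(mu)
        for x in xs:
            cos_theta = np.clip(np.dot(mu, x), -1.0, 1.0)
            theta = np.arccos(cos_theta)
            if theta > 1e-12:
                tangent += (theta / np.sin(theta)) * (x - cos_theta * mu)
        tangent /= len(xs)

        # Converged when the average tangent vector is (near) zero.
        step = np.linalg.norm(tangent)
        if step < tol:
            break

        # Exp map: move along the geodesic defined by the mean tangent.
        mu = np.cos(step) * mu + np.sin(step) * (tangent / step)
        mu /= np.linalg.norm(mu)  # guard against numerical drift

    return mu
```

The fixed point of this iteration minimizes the sum of squared geodesic distances to the inputs, which is what makes it a natural drop-in for plain weight averaging when merging models.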
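To reproduce a merge from a config like the YAML above, mergekit's `mergekit-yaml` CLI (`mergekit-yaml config.yaml ./output-dir`) is the usual route; the Python entry point sketched below should be equivalent. The paths are placeholders, and the API names follow the mergekit docs, so verify them against the installed version:

```python
# Sketch of driving mergekit from a script, equivalent to the mergekit-yaml CLI.
# Assumed imports per the mergekit docs -- check against your installed version.
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Hypothetical path to the YAML configuration shown in the README above.
with open("karcher-merge.yaml", encoding="utf-8") as fp:
    config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    config,
    "./merged-qwen2.5-14b",      # hypothetical output directory
    options=MergeOptions(
        cuda=False,              # set True to run the merge on GPU
        copy_tokenizer=True,     # copy a tokenizer into the output
    ),
)
```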