lbourdois committed
Commit 6cf4fee · verified · 1 Parent(s): d89d46c

Improve language tag


Hi! As the model is multilingual, this PR adds languages other than English to the language tag to improve the model's referencing (discoverability). Note that 29 languages are announced in the README, but only 13 are explicitly listed, so I was only able to add those 13.
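
To make the change concrete, here is a minimal sketch of how the `language` list in a model card's front matter could be read back and compared against the 13 codes this PR adds. The helper name `extract_languages` and the hand-rolled parser are illustrative assumptions (a real tool would more likely use a YAML library), and the sample card is abbreviated.

```python
# The 13 language codes added by this PR, in the order they appear in the diff.
ADDED_LANGUAGES = [
    "zho", "eng", "fra", "spa", "por", "deu", "ita",
    "rus", "jpn", "kor", "vie", "tha", "ara",
]

# Abbreviated stand-in for the updated README (assumption: flat `- item` lists).
SAMPLE_CARD = "\n".join(
    ["---", "library_name: transformers", "tags:", "- mergekit", "- merge", "language:"]
    + [f"- {code}" for code in ADDED_LANGUAGES]
    + ["---", "# CoderO1-DeepSeekR1-Coder-14B-Preview"]
)


def extract_languages(readme_text: str) -> list[str]:
    """Return the entries of the `language:` list in the YAML front matter.

    Hypothetical helper: it only understands front matter delimited by `---`
    lines and simple `- item` lists, which is all this card uses.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no front matter at all
    languages = []
    in_language = False
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # end of front matter
        if not stripped:
            continue  # ignore blank lines
        if stripped.startswith("- "):
            if in_language:
                languages.append(stripped[2:].strip())
        else:
            # Any non-list line is a key; only `language:` opens the list we want.
            in_language = stripped == "language:"
    return languages


missing = [code for code in ADDED_LANGUAGES if code not in extract_languages(SAMPLE_CARD)]
print("missing language tags:", missing)
```

A check like this only confirms that the listed codes are present; it says nothing about the remaining languages the README announces without naming.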

Files changed (1)
  1. README.md +61 -48
README.md CHANGED
@@ -1,48 +1,61 @@
- ---
- base_model:
- - Qwen/Qwen2.5-14B
- - deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
- - Qwen/Qwen2.5-Coder-14B-Instruct
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
- # CoderO1-DeepSeekR1-Coder-14B-Preview
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- GGUF files: [RDson/CoderO1-DeepSeekR1-Coder-14B-Preview-GGUF](https://huggingface.co/RDson/CoderO1-DeepSeekR1-Coder-14B-Preview-GGUF).
-
- This is based on the work of [FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Instruct-32B-Preview](https://huggingface.co/FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Instruct-32B-Preview).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the sce merge method using [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)
- * [Qwen/Qwen2.5-Coder-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-14B-Instruct)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- models:
-   # Pivot model
-   - model: Qwen/Qwen2.5-14B
-   # Target models
-   - model: Qwen/Qwen2.5-Coder-14B-Instruct
-   - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
- merge_method: sce
- base_model: Qwen/Qwen2.5-14B
- parameters:
-   select_topk: 1.0
- dtype: bfloat16
-
- ```
+ ---
+ base_model:
+ - Qwen/Qwen2.5-14B
+ - deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
+ - Qwen/Qwen2.5-Coder-14B-Instruct
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ ---
+ # CoderO1-DeepSeekR1-Coder-14B-Preview
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ GGUF files: [RDson/CoderO1-DeepSeekR1-Coder-14B-Preview-GGUF](https://huggingface.co/RDson/CoderO1-DeepSeekR1-Coder-14B-Preview-GGUF).
+
+ This is based on the work of [FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Instruct-32B-Preview](https://huggingface.co/FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Instruct-32B-Preview).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the sce merge method using [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) as a base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)
+ * [Qwen/Qwen2.5-Coder-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-14B-Instruct)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ models:
+   # Pivot model
+   - model: Qwen/Qwen2.5-14B
+   # Target models
+   - model: Qwen/Qwen2.5-Coder-14B-Instruct
+   - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
+ merge_method: sce
+ base_model: Qwen/Qwen2.5-14B
+ parameters:
+   select_topk: 1.0
+ dtype: bfloat16
+
+ ```
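
For anyone wanting to re-run the merge described in the README's configuration, the sketch below writes that configuration to a file and builds the corresponding `mergekit-yaml` invocation (config path first, output directory second). The file names, the helper names `write_config` and `build_merge_command`, and the two-space YAML indentation are assumptions for illustration; nothing here is taken from the author's actual workflow, and the command is only constructed, not executed.

```python
import shlex

# Merge configuration as given in the README (indentation is an assumption).
MERGE_CONFIG = """\
models:
  # Pivot model
  - model: Qwen/Qwen2.5-14B
  # Target models
  - model: Qwen/Qwen2.5-Coder-14B-Instruct
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
merge_method: sce
base_model: Qwen/Qwen2.5-14B
parameters:
  select_topk: 1.0
dtype: bfloat16
"""


def write_config(config_path: str) -> None:
    """Write the merge configuration to `config_path` (hypothetical helper)."""
    with open(config_path, "w", encoding="utf-8") as handle:
        handle.write(MERGE_CONFIG)


def build_merge_command(config_path: str, output_dir: str) -> list[str]:
    """Return the argument list for mergekit's `mergekit-yaml` CLI."""
    return ["mergekit-yaml", config_path, output_dir]


# Hypothetical paths; pass the list to subprocess.run to actually start the merge.
command = build_merge_command("sce-merge.yml", "./merged-model")
print(shlex.join(command))
```

Running the real merge downloads three 14B checkpoints, so building the command separately from executing it keeps the sketch cheap to try.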