Improve language tag

#1
by lbourdois - opened
Files changed (1)
  1. README.md +64 -51
README.md CHANGED
@@ -1,51 +1,64 @@
- ---
- base_model:
- - deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
- - Qwen/Qwen2.5-Coder-32B-Instruct
- - Qwen/Qwen2.5-32B
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
-
- # FuseAI uploaded their own Coder model here: [FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview](https://huggingface.co/FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview).
-
- GGUF files: [RDson/CoderO1-DeepSeekR1-Coder-32B-Preview-GGUF](https://huggingface.co/RDson/CoderO1-DeepSeekR1-Coder-32B-Preview-GGUF).
-
- *This is based on the works of [FuseAI/FuseO1-Preview](https://huggingface.co/collections/FuseAI/fuseo1-preview-678eb56093649b2688bc9977).*
-
- # CoderO1-DeepSeekR1-Coder-32B-Preview
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the sce merge method using [Qwen/Qwen2.5-32B](https://huggingface.co/Qwen/Qwen2.5-32B) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B)
- * [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- models:
-   # Pivot model
-   - model: Qwen/Qwen2.5-32B
-   # Target models
-   - model: Qwen/Qwen2.5-Coder-32B-Instruct
-   - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
- merge_method: sce
- base_model: Qwen/Qwen2.5-32B
- parameters:
-   select_topk: 1.0
- dtype: bfloat16
-
- ```
+ ---
+ base_model:
+ - deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
+ - Qwen/Qwen2.5-Coder-32B-Instruct
+ - Qwen/Qwen2.5-32B
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ ---
+
+ # FuseAI uploaded their own Coder model here: [FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview](https://huggingface.co/FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview).
+
+ GGUF files: [RDson/CoderO1-DeepSeekR1-Coder-32B-Preview-GGUF](https://huggingface.co/RDson/CoderO1-DeepSeekR1-Coder-32B-Preview-GGUF).
+
+ *This is based on the works of [FuseAI/FuseO1-Preview](https://huggingface.co/collections/FuseAI/fuseo1-preview-678eb56093649b2688bc9977).*
+
+ # CoderO1-DeepSeekR1-Coder-32B-Preview
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the sce merge method using [Qwen/Qwen2.5-32B](https://huggingface.co/Qwen/Qwen2.5-32B) as a base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B)
+ * [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ models:
+   # Pivot model
+   - model: Qwen/Qwen2.5-32B
+   # Target models
+   - model: Qwen/Qwen2.5-Coder-32B-Instruct
+   - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
+ merge_method: sce
+ base_model: Qwen/Qwen2.5-32B
+ parameters:
+   select_topk: 1.0
+ dtype: bfloat16
+
+ ```
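
The only substantive change in this diff is the new `language` list in the front matter, which uses three-letter ISO 639-3 codes. As a rough illustration of what those codes denote, the sketch below maps each one to its two-letter ISO 639-1 equivalent. The `ISO_639_3_TO_1` table and the `to_two_letter` helper are hypothetical names introduced here for illustration; they are not part of this PR or of any Hugging Face library.

```python
# Language codes added to the README front matter in this PR (ISO 639-3).
ADDED_LANGUAGES = [
    "zho", "eng", "fra", "spa", "por", "deu", "ita",
    "rus", "jpn", "kor", "vie", "tha", "ara",
]

# Hypothetical lookup table (ISO 639-3 -> ISO 639-1), covering only the
# thirteen codes that appear in the diff above.
ISO_639_3_TO_1 = {
    "zho": "zh",  # Chinese
    "eng": "en",  # English
    "fra": "fr",  # French
    "spa": "es",  # Spanish
    "por": "pt",  # Portuguese
    "deu": "de",  # German
    "ita": "it",  # Italian
    "rus": "ru",  # Russian
    "jpn": "ja",  # Japanese
    "kor": "ko",  # Korean
    "vie": "vi",  # Vietnamese
    "tha": "th",  # Thai
    "ara": "ar",  # Arabic
}


def to_two_letter(codes):
    """Return the ISO 639-1 code for each ISO 639-3 code.

    Raises KeyError for any code missing from the table, so an
    unexpected tag is reported rather than silently dropped.
    """
    return [ISO_639_3_TO_1[code] for code in codes]


if __name__ == "__main__":
    print(to_two_letter(ADDED_LANGUAGES))
```

The lookup deliberately fails on unknown codes, which makes it usable as a quick sanity check on the tag list before it is committed.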