3ed0k4 committed on
Commit f0dbe60 · verified · 1 Parent(s): 210ad5b

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +51 -0
README.md ADDED
@@ -0,0 +1,51 @@
+ ---
+ license: apache-2.0
+ tags:
+ - merge
+ - mergekit
+ - lazymergekit
+ - prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M
+ - deepseek-ai/DeepSeek-R1-Distill-Llama-8B
+ ---
+
+ # KRDModel
+
+ KRDModel is a merge of the following models using [mergekit](https://github.com/cg123/mergekit):
+ * [prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M](https://huggingface.co/prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M)
+ * [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)
+
+ ## 🧩 Configuration
+
+ ```yaml
+ slices:
+ - sources:
+   - model: prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M
+     layer_range:
+     - 0
+     - 32
+   - model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
+     layer_range:
+     - 0
+     - 32
+ merge_method: slerp
+ base_model: prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M
+ parameters:
+   t:
+   - filter: self_attn
+     value:
+     - 0
+     - 0.5
+     - 0.3
+     - 0.7
+     - 1
+   - filter: mlp
+     value:
+     - 1
+     - 0.5
+     - 0.7
+     - 0.3
+     - 0
+   - value: 0.5
+ dtype: bfloat16
+
+ ```
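
The configuration above asks for a `slerp` merge whose interpolation factor `t` varies by layer: a five-value list for `self_attn` tensors, the mirrored list for `mlp` tensors, and a constant `0.5` for everything else. The sketch below illustrates that idea in plain Python. It is not mergekit's implementation: the helper names `expand_gradient` and `slerp` are hypothetical, the linear spreading of the anchor values over the 32 layers is an assumption about how such a list is applied, and the vectors are toy stand-ins for weight tensors. With `t = 0` the result equals the first (base) vector and with `t = 1` the second.

```python
import math


def expand_gradient(anchors, num_layers):
    """Spread a short list of anchor values over num_layers layers.

    Assumption: anchors are evenly spaced and linearly interpolated.
    """
    if len(anchors) == 1 or num_layers == 1:
        return [float(anchors[0])] * num_layers
    values = []
    for layer in range(num_layers):
        # Position of this layer on the anchor axis, in [0, len(anchors) - 1].
        pos = layer / (num_layers - 1) * (len(anchors) - 1)
        lo = min(int(pos), len(anchors) - 2)
        frac = pos - lo
        values.append((1 - frac) * anchors[lo] + frac * anchors[lo + 1])
    return values


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flat vectors."""
    if len(v0) != len(v1):
        # Interpolation is only defined for tensors of identical shape.
        raise ValueError("slerp needs tensors of identical shape")
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    linear = [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    if n0 < eps or n1 < eps:
        return linear
    # Angle between the two vectors, clamped for numerical safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (n0 * n1)
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        # Nearly parallel vectors: fall back to linear interpolation.
        return linear
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Per-layer t for the two filters in the config (32 layers per slice).
attn_t = expand_gradient([0, 0.5, 0.3, 0.7, 1], 32)
mlp_t = expand_gradient([1, 0.5, 0.7, 0.3, 0], 32)

# Toy example: blend two 2-element "tensors" halfway.
blended = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

Under these assumptions the first layer's attention weights come entirely from the base model (`attn_t[0]` is 0) while its MLP weights come entirely from the other model (`mlp_t[0]` is 1), and the balance reverses by the last layer. The shape check in `slerp` reflects a practical constraint: both source tensors must have the same dimensions for the interpolation to be defined.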