---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M
- deepseek-ai/DeepSeek-R1-Distill-Llama-8B
---

# KRDModel

KRDModel is a merge of the following models using [mergekit](https://github.com/cg123/mergekit):
* [prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M](https://huggingface.co/prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M)
* [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)

## 🧩 Configuration

```yaml
slices:
- sources:
  - model: prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M
    layer_range:
    - 0
    - 32
  - model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
    layer_range:
    - 0
    - 32
merge_method: slerp
base_model: prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M
parameters:
  t:
  - filter: self_attn
    value:
    - 0
    - 0.5
    - 0.3
    - 0.7
    - 1
  - filter: mlp
    value:
    - 1
    - 0.5
    - 0.7
    - 0.3
    - 0
  - value: 0.5
dtype: bfloat16
```
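
The `slerp` method and the per-filter `t` lists in the configuration can be illustrated with a small NumPy sketch. This is not mergekit's code. It assumes that `t = 0` keeps the `base_model` tensor and `t = 1` takes the other model's tensor, and that a list of `t` values is treated as a gradient whose anchors are spaced evenly over layer depth and linearly interpolated. The function names `slerp` and `layer_t` and the constants below are hypothetical, chosen only for this example.

```python
import numpy as np


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two same-shaped weight tensors."""
    v0 = np.asarray(v0, dtype=np.float64)
    v1 = np.asarray(v1, dtype=np.float64)
    # Normalize flattened copies to measure the angle between the tensors.
    a = v0.ravel() / (np.linalg.norm(v0) + eps)
    b = v1.ravel() / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(a, b), -1.0, 1.0)
    if abs(dot) > 1.0 - 1e-6:
        # Nearly colinear tensors: fall back to plain linear interpolation.
        return (1.0 - t) * v0 + t * v1
    theta = np.arccos(dot)
    sin_theta = np.sin(theta)
    w0 = np.sin((1.0 - t) * theta) / sin_theta
    w1 = np.sin(t * theta) / sin_theta
    return w0 * v0 + w1 * v1


def layer_t(anchors, layer_idx, num_layers):
    """Assumed gradient rule: anchors evenly spaced over depth, linearly interpolated."""
    if num_layers == 1:
        return float(anchors[0])
    pos = layer_idx / (num_layers - 1)          # 0.0 at first layer, 1.0 at last
    xs = np.linspace(0.0, 1.0, len(anchors))    # anchor positions along depth
    return float(np.interp(pos, xs, anchors))


# Values copied from the configuration above.
SELF_ATTN_T = [0, 0.5, 0.3, 0.7, 1]
MLP_T = [1, 0.5, 0.7, 0.3, 0]
DEFAULT_T = 0.5   # applies to tensors matched by neither filter
NUM_LAYERS = 32   # from layer_range 0..32

if __name__ == "__main__":
    for i in (0, 8, 16, 24, 31):
        print(i, layer_t(SELF_ATTN_T, i, NUM_LAYERS), layer_t(MLP_T, i, NUM_LAYERS))
```

Under these assumptions, the attention weights move from the base model in the first layer toward the second model in the last layer, while the MLP weights follow the opposite schedule. All remaining tensors are blended at a fixed `t` of 0.5.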