---
base_model:
- SicariusSicariiStuff/Impish_QWEN_14B-1M
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
- Qwen/Qwen2.5-14B-Instruct-1M
- Zhihu-ai/Zhi-writing-dsr1-14b
- huihui-ai/Qwen2.5-14B-Instruct-1M-abliterated
- huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
- Qwen/Qwen2.5-14B-Instruct
- mergekit-community/Qwen2.5-14B-della-code
- tanliboy/lambda-qwen2.5-14b-dpo-test
library_name: transformers
tags:
- mergekit
- merge
license: apache-2.0
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
pipeline_tag: text-generation
---

*Preliminary test results are excellent!* This is a very promising method for model merging. The best-performing ratio found so far is instruction : reasoning : code = 6 : 2 : 1.

# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the [Karcher Mean](https://en.wikipedia.org/wiki/Karcher_mean) merge method, with [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) as the base.

### Models Merged

The following models were included in the merge:

**instruction** (6):
* [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
* [Qwen/Qwen2.5-14B-Instruct-1M](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct-1M)
* [huihui-ai/Qwen2.5-14B-Instruct-1M-abliterated](https://huggingface.co/huihui-ai/Qwen2.5-14B-Instruct-1M-abliterated)
* [huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2](https://huggingface.co/huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2)
* [SicariusSicariiStuff/Impish_QWEN_14B-1M](https://huggingface.co/SicariusSicariiStuff/Impish_QWEN_14B-1M)
* [tanliboy/lambda-qwen2.5-14b-dpo-test](https://huggingface.co/tanliboy/lambda-qwen2.5-14b-dpo-test)

**reasoning** (2):
* [Zhihu-ai/Zhi-writing-dsr1-14b](https://huggingface.co/Zhihu-ai/Zhi-writing-dsr1-14b)
* [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)

**code** (1):
* [mergekit-community/Qwen2.5-14B-della-code](https://huggingface.co/mergekit-community/Qwen2.5-14B-della-code)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: mergekit-community/Qwen2.5-14B-della-code
  - model: Zhihu-ai/Zhi-writing-dsr1-14b
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
  - model: huihui-ai/Qwen2.5-14B-Instruct-1M-abliterated
  - model: huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
  - model: Qwen/Qwen2.5-14B-Instruct-1M
  - model: SicariusSicariiStuff/Impish_QWEN_14B-1M
  - model: tanliboy/lambda-qwen2.5-14b-dpo-test
  - model: Qwen/Qwen2.5-14B-Instruct
merge_method: karcher
base_model: Qwen/Qwen2.5-14B-Instruct
parameters:
  max_iter: 1000
  normalize: true
  int8_mask: true
tokenizer_source: base
dtype: float16
```
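### What the Karcher mean does

The Karcher (Fréchet) mean generalizes the arithmetic mean to curved spaces: instead of averaging points directly, it finds the point minimizing the sum of squared geodesic distances. Below is a minimal NumPy sketch of the classic iteration on the unit hypersphere, included for intuition only; it is *not* mergekit's implementation, and the function name, tolerance, and the assumption that parameters are treated as unit-normalized vectors are all illustrative.

```python
import numpy as np

def karcher_mean_sphere(points, max_iter=1000, tol=1e-9):
    """Illustrative Karcher mean of unit vectors on the hypersphere,
    computed by iterative tangent-space averaging. Not mergekit's code."""
    # Initialize with the normalized Euclidean mean.
    mu = points.mean(axis=0)
    mu /= np.linalg.norm(mu)
    for _ in range(max_iter):
        # Log map: lift each point into the tangent space at mu.
        tangents = []
        for p in points:
            cos_t = np.clip(mu @ p, -1.0, 1.0)
            theta = np.arccos(cos_t)
            if theta < 1e-12:
                tangents.append(np.zeros_like(p))
            else:
                tangents.append((p - cos_t * mu) * (theta / np.sin(theta)))
        # Average in the tangent space; stop when the update is tiny.
        step = np.mean(tangents, axis=0)
        norm = np.linalg.norm(step)
        if norm < tol:
            break
        # Exp map: move mu along the mean tangent direction.
        mu = np.cos(norm) * mu + np.sin(norm) * step / norm
        mu /= np.linalg.norm(mu)
    return mu

# Toy example: average three nearby directions in R^3.
pts = np.array([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.9, 0.0, 0.1]])
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
print(karcher_mean_sphere(pts))
```

Each iteration lifts the points into the tangent space at the current estimate, averages there, and maps back onto the sphere; the `max_iter: 1000` parameter in the YAML above bounds this kind of fixed-point loop.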
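## Usage

A standard `transformers` loading snippet should work for this merge. The repository id below is a placeholder, not the actual model id; substitute the real one, and note that the prompt and generation settings are only examples.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-namespace/your-merged-model"  # placeholder: substitute the actual repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Briefly explain what a Karcher mean is."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```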