---
base_model:
- mergekit-community/Qwen2.5-14B-della-V6-dpo
- mergekit-community/Qwen2.5-14B-della-Nova-dpo
- agentica-org/DeepCoder-14B-Preview
- mergekit-community/Qwen2.5-14B-della-base-dpo
- mergekit-community/Qwen2.5-14B-della-1M-dpo
- Zhihu-ai/Zhi-writing-dsr1-14b
- mergekit-community/Qwen2.5-14B-della-v2-dpo
- mergekit-community/Qwen2.5-14B-della-code
library_name: transformers
tags:
- mergekit
- merge
---
# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the [Karcher Mean](https://en.wikipedia.org/wiki/Karcher_mean) merge method using [mergekit-community/Qwen2.5-14B-della-1M-dpo](https://huggingface.co/mergekit-community/Qwen2.5-14B-della-1M-dpo) as a base.

### Models Merged

The following models were included in the merge:
* [mergekit-community/Qwen2.5-14B-della-V6-dpo](https://huggingface.co/mergekit-community/Qwen2.5-14B-della-V6-dpo)
* [mergekit-community/Qwen2.5-14B-della-Nova-dpo](https://huggingface.co/mergekit-community/Qwen2.5-14B-della-Nova-dpo)
* [agentica-org/DeepCoder-14B-Preview](https://huggingface.co/agentica-org/DeepCoder-14B-Preview)
* [mergekit-community/Qwen2.5-14B-della-base-dpo](https://huggingface.co/mergekit-community/Qwen2.5-14B-della-base-dpo)
* [Zhihu-ai/Zhi-writing-dsr1-14b](https://huggingface.co/Zhihu-ai/Zhi-writing-dsr1-14b)
* [mergekit-community/Qwen2.5-14B-della-v2-dpo](https://huggingface.co/mergekit-community/Qwen2.5-14B-della-v2-dpo)
* [mergekit-community/Qwen2.5-14B-della-code](https://huggingface.co/mergekit-community/Qwen2.5-14B-della-code)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: Zhihu-ai/Zhi-writing-dsr1-14b
  - model: agentica-org/DeepCoder-14B-Preview
  - model: mergekit-community/Qwen2.5-14B-della-code
  - model: mergekit-community/Qwen2.5-14B-della-v2-dpo
  - model: mergekit-community/Qwen2.5-14B-della-V6-dpo
  - model: mergekit-community/Qwen2.5-14B-della-Nova-dpo
  - model: mergekit-community/Qwen2.5-14B-della-base-dpo
  - model: mergekit-community/Qwen2.5-14B-della-1M-dpo
merge_method: karcher
base_model: mergekit-community/Qwen2.5-14B-della-1M-dpo
parameters:
  max_iter: 1000
tokenizer_source: base
dtype: float16
int8_mask: true
normalize: true
```