---
license: apache-2.0
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
base_model:
- Qwen/Qwen2.5-14B
- Qwen/Qwen2.5-14B-Instruct
- Qwen/Qwen2.5-14B-Instruct-1M
- tanliboy/lambda-qwen2.5-14b-dpo-test
- arcee-ai/SuperNova-Medius
- arcee-ai/Virtuoso-Small-v2
- Azure99/Blossom-V6-14B
- Qwen/Qwen2.5-Coder-14B
- Qwen/Qwen2.5-Coder-14B-Instruct
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
- huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2
pipeline_tag: text-generation
tags:
- merge
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64e174e202fa032de4143324/zx2LWe9rip2AVr76BH4Er.png)

# Qwen2.5-14B-YOYO-V4

*[Qwen2.5-14B-YOYO-V5 Officially Released!](https://huggingface.co/YOYO-AI/Qwen2.5-14B-YOYO-V5)*

**Key Highlights:**

*1. Richer Knowledge & Improved Instruction Compliance*

*2. Integrated Code Model and R1 Distillation for Improved Coding/Reasoning*

*3. 1M-Token Long Context Window*

## First stage:

```yaml
merge_method: sce
models:
  # Pivot model
  - model: Qwen/Qwen2.5-14B-Instruct-1M
  # Target models
  - model: Qwen/Qwen2.5-14B
base_model: Qwen/Qwen2.5-14B-Instruct-1M
parameters:
  select_topk: 1
dtype: bfloat16
tokenizer_source: base
normalize: true
int8_mask: true
name: Qwen2.5-14B-1M
```

```yaml
models:
  - model: tanliboy/lambda-qwen2.5-14b-dpo-test
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Qwen2.5-14B-1M
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-14B-1M-della
```

## Second stage:

```yaml
models:
  - model: Qwen/Qwen2.5-14B-Instruct
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
  - model: Qwen/Qwen2.5-14B-Instruct-1M
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: arcee-ai/Virtuoso-Small-v2
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-14B-YOYO-della1
```

```yaml
models:
  - model: Qwen/Qwen2.5-14B-Instruct
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
  - model: Qwen/Qwen2.5-14B-Instruct-1M
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: arcee-ai/SuperNova-Medius
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-14B-YOYO-della2
```

```yaml
models:
  - model: Qwen/Qwen2.5-14B-Instruct
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
  - model: Qwen/Qwen2.5-14B-Instruct-1M
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Azure99/Blossom-V6-14B
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-14B-YOYO-della3
```

## Third stage:

### Step 1:

```yaml
models:
  - model: Qwen/Qwen2.5-Coder-14B-Instruct
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Qwen/Qwen2.5-Coder-14B
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-Coder-14B-della
```

### Step 2:

```yaml
merge_method: model_stock
base_model: Qwen/Qwen2.5-14B-Instruct
models:
  - model: Qwen2.5-Coder-14B-della
  - model: arcee-ai/Virtuoso-Small-v2
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
  - model: huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2
dtype: bfloat16
tokenizer_source: base
int8_mask: true
normalize: true
name: Qwen2.5-14B-mst
```

## Final stage:

```yaml
merge_method: model_stock
base_model: Qwen2.5-14B-1M-della
models:
  - model: Qwen2.5-14B-della1
  - model: Qwen2.5-14B-della2
  - model: Qwen2.5-14B-della3
  - model: Qwen2.5-14B-mst
dtype: bfloat16
tokenizer_source: base
int8_mask: true
normalize: true
name: YOYO-AI/Qwen2.5-14B-YOYO-V4
```
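
Because several configs consume the output of an earlier one, the stages have to be run in dependency order. The following is a minimal Python sketch of that ordering, not part of the original recipe. It hard-codes each config's `name` together with its `models` and `base_model` entries, treats any input without a `/` namespace as a locally built intermediate, and sorts the stages topologically. The helper names (`STAGES`, `ALIASES`, `merge_order`, `unresolved_inputs`) are hypothetical. One labeled assumption: the final stage lists `Qwen2.5-14B-della1` to `Qwen2.5-14B-della3`, which the sketch takes to mean the second-stage outputs named `Qwen2.5-14B-YOYO-della1` to `Qwen2.5-14B-YOYO-della3`.

```python
from graphlib import TopologicalSorter

# Output name -> inputs (models plus base_model), copied from the configs above.
STAGES = {
    "Qwen2.5-14B-1M": [
        "Qwen/Qwen2.5-14B-Instruct-1M",
        "Qwen/Qwen2.5-14B",
    ],
    "Qwen2.5-14B-1M-della": [
        "tanliboy/lambda-qwen2.5-14b-dpo-test",
        "Qwen2.5-14B-1M",
    ],
    "Qwen2.5-14B-YOYO-della1": [
        "Qwen/Qwen2.5-14B-Instruct",
        "Qwen/Qwen2.5-14B-Instruct-1M",
        "arcee-ai/Virtuoso-Small-v2",
    ],
    "Qwen2.5-14B-YOYO-della2": [
        "Qwen/Qwen2.5-14B-Instruct",
        "Qwen/Qwen2.5-14B-Instruct-1M",
        "arcee-ai/SuperNova-Medius",
    ],
    "Qwen2.5-14B-YOYO-della3": [
        "Qwen/Qwen2.5-14B-Instruct",
        "Qwen/Qwen2.5-14B-Instruct-1M",
        "Azure99/Blossom-V6-14B",
    ],
    "Qwen2.5-Coder-14B-della": [
        "Qwen/Qwen2.5-Coder-14B-Instruct",
        "Qwen/Qwen2.5-Coder-14B",
    ],
    "Qwen2.5-14B-mst": [
        "Qwen/Qwen2.5-14B-Instruct",
        "Qwen2.5-Coder-14B-della",
        "arcee-ai/Virtuoso-Small-v2",
        "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B",
        "huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2",
    ],
    "YOYO-AI/Qwen2.5-14B-YOYO-V4": [
        "Qwen2.5-14B-1M-della",
        "Qwen2.5-14B-della1",
        "Qwen2.5-14B-della2",
        "Qwen2.5-14B-della3",
        "Qwen2.5-14B-mst",
    ],
}

# ASSUMPTION (not stated in the card): the final-stage names map to the
# second-stage outputs whose names carry the extra "YOYO-" segment.
ALIASES = {
    "Qwen2.5-14B-della1": "Qwen2.5-14B-YOYO-della1",
    "Qwen2.5-14B-della2": "Qwen2.5-14B-YOYO-della2",
    "Qwen2.5-14B-della3": "Qwen2.5-14B-YOYO-della3",
}


def resolve(name):
    """Map a referenced name to the stage name it is assumed to mean."""
    return ALIASES.get(name, name)


def merge_order(stages):
    """Return stage names ordered so that every local input is built first."""
    graph = {}
    for output, inputs in stages.items():
        # Only inputs produced by another stage count as dependencies;
        # everything else is treated as an already published model.
        graph[output] = {resolve(i) for i in inputs if resolve(i) in stages}
    return list(TopologicalSorter(graph).static_order())


def unresolved_inputs(stages):
    """List inputs with no '/' namespace that no stage produces."""
    missing = set()
    for inputs in stages.values():
        for name in inputs:
            if "/" not in name and resolve(name) not in stages:
                missing.add(name)
    return sorted(missing)


if __name__ == "__main__":
    for step, name in enumerate(merge_order(STAGES), start=1):
        print(step, name)
    print("unresolved:", unresolved_inputs(STAGES))
```

Clearing `ALIASES` makes `unresolved_inputs` report the three final-stage names, which is a quick way to see where the recipe's intermediate names diverge.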