# merge
This is a merge of pre-trained language models created using mergekit.
## Merge Details

### Merge Method
This model was merged using the DARE TIES merge method, with CultriX/SeQwence-14Bv1 as the base model.
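For intuition: DARE randomly drops a fraction (1 − density) of each fine-tuned model's parameter delta and rescales the survivors by 1/density so the expected update is preserved, while TIES resolves sign conflicts between the surviving deltas before adding the weighted result back to the base. The snippet below is a minimal per-tensor sketch of that idea under simplified sign-election logic; it is illustrative only, not mergekit's actual implementation, and all function names are hypothetical.

```python
import torch

def dare_sparsify(delta: torch.Tensor, density: float) -> torch.Tensor:
    # DARE: keep each entry of the task delta with probability `density`,
    # then rescale survivors by 1/density so the expected update is unchanged.
    mask = torch.bernoulli(torch.full_like(delta, density))
    return delta * mask / density

def dare_ties_merge(base: torch.Tensor,
                    task_params: list[torch.Tensor],
                    weights: list[float],
                    densities: list[float]) -> torch.Tensor:
    # Sparsify each model's delta relative to the shared base model.
    deltas = [dare_sparsify(p - base, d)
              for p, d in zip(task_params, densities)]
    weighted = torch.stack([w * d for w, d in zip(weights, deltas)])
    # TIES-style sign election: keep only entries whose sign agrees with
    # the weighted-majority sign across models, then sum and apply.
    elected = torch.sign(weighted.sum(dim=0))
    agree = torch.sign(weighted) == elected
    return base + (weighted * agree).sum(dim=0)

# Toy usage: three fine-tuned variants of a random base tensor.
base = torch.randn(4, 4)
finetuned = [base + 0.1 * torch.randn(4, 4) for _ in range(3)]
merged = dare_ties_merge(base, finetuned, [0.5, 0.3, 0.2], [0.6, 0.6, 0.5])
```

In the configuration below, each model's `weight` scales its delta in the merge and its `density` controls how much of the delta survives the DARE drop step.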
### Models Merged
The following models were included in the merge:
- sometimesanotion/Qwen2.5-14B-Vimarckoso
- CultriX/SeQwence-14B-EvolMerge
- CultriX/Qwen2.5-14B-SLERPv7
- qingy2019/Qwen2.5-Math-14B-Instruct
- allknowingroger/QwenSlerp6-14B
- CultriX/Qwen2.5-14B-Wernicke
- VAGOsolutions/SauerkrautLM-v2-14b-DPO
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: VAGOsolutions/SauerkrautLM-v2-14b-DPO
    parameters:
      weight: 0.20    # Strong IFEval and factual reasoning baseline
      density: 0.6
  - model: allknowingroger/QwenSlerp6-14B
    parameters:
      weight: 0.20    # Balanced reasoning across multiple benchmarks
      density: 0.6
  - model: CultriX/SeQwence-14B-EvolMerge
    parameters:
      weight: 0.15    # Generalist model for BBH and MUSR
      density: 0.5
  - model: CultriX/Qwen2.5-14B-Wernicke
    parameters:
      weight: 0.15    # QA leader for GPQA and MUSR
      density: 0.6    # Increased density to preserve more QA-specific parameters
  - model: qingy2019/Qwen2.5-Math-14B-Instruct
    parameters:
      weight: 0.15    # Specialist for MATH and advanced reasoning
      density: 0.6
  - model: sometimesanotion/Qwen2.5-14B-Vimarckoso
    parameters:
      weight: 0.10    # MUSR leader for nuanced multi-step reasoning
      density: 0.5
  - model: CultriX/Qwen2.5-14B-SLERPv7
    parameters:
      weight: 0.05    # Contextual reasoning support for BBH and smaller benchmarks
      density: 0.5
base_model: CultriX/SeQwence-14Bv1
merge_method: dare_ties
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
adaptive_merge_parameters:
  task_weights:
    IFEval: 1.3       # Enhanced instruction-following and factual tasks
    BBH: 1.3          # Strengthened complex reasoning capabilities
    MATH_Lvl_5: 1.4   # Prioritize advanced mathematical tasks
    GPQA: 1.4         # Boost graduate-level knowledge capabilities
    MuSR: 1.3         # Strengthen multi-step reasoning on complex tasks
    MMLU_PRO: 1.2     # Ensure broad domain understanding
  smoothing_factor: 0.15   # Sharper blending for reasoning and factual tasks
gradient_clipping: 0.9     # Tighter control for precise parameter scaling
```
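To reproduce the merge, save the configuration above to a file (e.g. `merge_config.yaml`, a hypothetical filename) and run mergekit's CLI, for example `mergekit-yaml merge_config.yaml ./output-model-directory`. The merged checkpoint then loads like any other Qwen2.5 model; the following is a standard transformers usage sketch, assuming enough GPU memory to hold a 14B-parameter model in bfloat16.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CultriX/Qwen2.5-14B-Emergedv3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the merge dtype above
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain the TIES merge method in one paragraph."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```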