# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
## Merge Details

### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with CultriX/SeQwence-14Bv1 as the base model.
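DARE TIES combines two ideas: DARE sparsifies each donor model's delta from the base by randomly dropping parameters (keeping roughly `density` of them) and rescaling the survivors by `1/density`, and TIES resolves sign disagreements between donors before the weighted deltas are summed back onto the base. Below is a minimal NumPy sketch of this procedure on flat parameter vectors, purely illustrative: the actual merge is performed by mergekit, `dare_ties_merge` is a hypothetical helper rather than mergekit API, and real TIES additionally renormalizes by the weights of the agreeing donors.

```python
import numpy as np

def dare_ties_merge(base, models, weights, densities, seed=0):
    """Illustrative DARE TIES sketch on flat parameter vectors.

    base:      1-D array of base-model parameters
    models:    list of 1-D arrays, one per donor model
    weights:   per-model merge weights (e.g. 0.38, 0.32, ...)
    densities: per-model keep probabilities (e.g. 0.65, 0.60, ...)
    """
    rng = np.random.default_rng(seed)
    deltas = []
    for m, w, d in zip(models, weights, densities):
        delta = m - base                        # task vector relative to the base
        mask = rng.random(delta.shape) < d      # DARE: keep ~density of the entries...
        delta = np.where(mask, delta / d, 0.0)  # ...and rescale survivors by 1/density
        deltas.append(w * delta)                # apply the merge weight
    stacked = np.stack(deltas)
    sign = np.sign(stacked.sum(axis=0))         # TIES: elect a sign per parameter
    agree = np.sign(stacked) == sign            # drop deltas that conflict with it
    return base + np.where(agree, stacked, 0.0).sum(axis=0)

# Toy usage on random vectors:
base = np.zeros(8)
donors = [base + np.random.randn(8) * 0.1 for _ in range(3)]
merged = dare_ties_merge(base, donors,
                         weights=[0.5, 0.3, 0.2],
                         densities=[0.65, 0.60, 0.55])
```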
### Models Merged

The following models were included in the merge:
- CultriX/Qwen2.5-14B-Wernickev3
- CultriX/Qwen2.5-14B-Emergedv3
- qingy2019/Qwen2.5-Math-14B-Instruct
- CultriX/Qwen2.5-14B-FinalMerge
### Configuration

The following YAML configuration was used to produce this model:
```yaml
models:
  - model: CultriX/Qwen2.5-14B-Wernickev3
    parameters:
      weight: 0.38   # Slight reduction to balance with FinalMerge's generalist capabilities.
      density: 0.65  # Retain significant parameters for stability and strong task performance.
  - model: CultriX/Qwen2.5-14B-FinalMerge
    parameters:
      weight: 0.32   # Slight increase to ensure its generalist capabilities are fully utilized.
      density: 0.60  # Balanced density for comprehensive task coverage.
  - model: CultriX/Qwen2.5-14B-Emergedv3
    parameters:
      weight: 0.20   # Retains a focused contribution to specific task optimizations.
      density: 0.55  # Moderate density ensures efficient parameter usage.
  - model: qingy2019/Qwen2.5-Math-14B-Instruct
    parameters:
      weight: 0.10   # Consistent with its specialist focus, balancing lower weight with higher density.
      density: 0.70  # High density ensures retention of advanced reasoning and MATH-related parameters.
merge_method: dare_ties
base_model: CultriX/SeQwence-14Bv1
parameters:
  normalize: true   # Ensures all models are scaled to compatible parameter ranges.
  int8_mask: true   # Optimizes memory and computational efficiency without accuracy loss.
dtype: bfloat16     # Provides better memory efficiency and numerical stability.
adaptive_merge_parameters:
  task_weights:
    tinyArc: 1.3            # Slight reduction to balance with generalist contributions.
    tinyHellaswag: 1.3      # Maintains strong performance in contextual reasoning.
    tinyMMLU: 1.2           # Balanced focus for domain-specific knowledge.
    tinyTruthfulQA: 1.2     # Adjusted to ensure fair contribution without over-prioritization.
    tinyTruthfulQA_mc1: 1.1 # Moderate priority to balance with the other tiny benchmarks.
    tinyWinogrande: 1.2     # Strong contextual reasoning support from generalist models.
    IFEval: 1.5             # High weight for general instruction-following capabilities.
    BBH: 1.5                # Prioritizes complex reasoning and multi-step problem-solving tasks.
    MATH: 1.55              # Slight reduction to balance MATH with other advanced reasoning benchmarks.
    GPQA: 1.4               # Balanced to reflect contributions from both generalist and specialist models.
    MUSR: 1.4               # Increased slightly to strengthen multi-step reasoning.
    MMLU-PRO: 1.3           # Maintains general task performance across multitask domain knowledge.
  smoothing_factor: 0.18    # Slightly increased for smoother blending across task boundaries.
gradient_clipping: 0.88     # Tightened slightly for stability, preventing parameter over-contribution.
```
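The merge can be reproduced by passing this configuration to mergekit's CLI (e.g. `mergekit-yaml config.yaml ./output-model-directory`). Once merged, or when pulling the published checkpoint, the model loads like any other Qwen2.5 model. A minimal usage sketch with transformers, assuming a GPU with enough memory for a 14B model in bfloat16:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CultriX/Qwen2.5-14B-FinalMergev2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used for the merge
    device_map="auto",
)

messages = [{"role": "user", "content": "Solve: 12 * 17 = ?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```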