Megatron-Opus-14B-Stock

[ Megatron+Primal+Elite2 ] is based on the Qwen 2.5 14B modality architecture, designed to enhance the reasoning capabilities of 14B-parameter models. It has been fine-tuned on a Synthetic dataset entries based on one half of Qwen’s QWQ and DeepSeek R1, further optimizing its chain-of-thought (CoT) reasoning and logical problem-solving abilities. The model demonstrates significant improvements in context understanding, structured data processing, and long-context comprehension, making it ideal for complex reasoning tasks, instruction-following, and text generation.

merge

This is a merge of pre-trained language models created using mergekit.

Merge Method

This model was merged using the Model Stock merge method using prithivMLmods/Megatron-Opus-14B-Exp as a base.

Models Merged

The following models were included in the merge:

Configuration

The following YAML configuration was used to produce this model:

merge_method:        model_stock
base_model:          prithivMLmods/Megatron-Opus-14B-Exp
tokenizer_source:    base
dtype:               bfloat16
out_dtype:           bfloat16
parameters:
  int8_mask:         true
  normalize:         true
  rescale:           false
models:
  - model:           prithivMLmods/Megatron-Opus-14B-Exp
  - model:           prithivMLmods/Primal-Opus-14B-Optimus-v1
  - model:           prithivMLmods/Calcium-Opus-14B-Elite2-R1