# merged_llm

This is a merge of pre-trained language models created using [mergekit](https://github.com/arcee-ai/mergekit).

## Merge Details

### Merge Method

This model was merged using the linear [DELLA](https://arxiv.org/abs/2406.11617) merge method, with alexxi19/ft-v1-nemo-base as the base model.
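DELLA operates on task vectors: the element-wise difference between each fine-tuned model and the base. Each delta is stochastically pruned, with drop probabilities spread around `1 - density` so that smaller-magnitude entries are more likely to be dropped (`epsilon` controls the spread), survivors are rescaled to preserve the expected update, and the pruned deltas are summed linearly by `weight` (scaled by `lambda`) before being added back to the base. Below is a rough NumPy sketch of the idea; it is not mergekit's implementation, and the exact probability schedule is an assumed reading of the `density`/`epsilon`/`lambda` parameters in the config further down.

```python
# Illustrative sketch of linear DELLA merging for a single tensor.
# NOT mergekit's code; the probability schedule is an assumption.
import numpy as np

def della_prune(delta, density, epsilon, rng):
    """Stochastically drop entries of a task vector, keeping ~`density`.

    Drop probabilities are spread around (1 - density): the largest-
    magnitude entries get (1 - density) - epsilon, the smallest get
    (1 - density) + epsilon. Kept entries are rescaled by 1 / (1 - p)
    so the expected update is preserved.
    """
    flat = delta.ravel()
    ranks = np.argsort(np.argsort(np.abs(flat)))   # 0 = smallest magnitude
    frac = ranks / max(len(flat) - 1, 1)           # in [0, 1], 1 = largest
    p_drop = np.clip((1.0 - density) + epsilon * (1.0 - 2.0 * frac), 0.0, 1.0)
    keep = rng.random(len(flat)) >= p_drop
    out = np.zeros_like(flat)
    out[keep] = flat[keep] / (1.0 - p_drop[keep])  # rescale survivors
    return out.reshape(delta.shape)

def della_linear(base, tuned_models, epsilon=0.05, lam=1.0, rng=None):
    """Weighted linear sum of pruned task vectors, added back to the base.

    `tuned_models` is a list of (params, density, weight) triples, mirroring
    the per-model `density`/`weight` entries in the YAML config below.
    """
    rng = rng or np.random.default_rng(0)
    merged = base.copy()
    for params, density, weight in tuned_models:
        delta = params - base  # task vector vs. the base model
        merged += lam * weight * della_prune(delta, density, epsilon, rng)
    return merged
```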

### Models Merged

The following models were included in the merge:

* anthracite-org/magnum-v2-12b
* Nitral-AI/Captain-Eris_Violet-V0.420-12B

### Configuration

The following YAML configuration was used to produce this model (the commented-out blocks are earlier dare_ties and slerp experiments kept for reference):

```yaml
# models:
#   - model: anthracite-org/magnum-v2-12b  # instruct model
#     parameters:
#       density: 0.6
#       weight: 0.5
#   # - model: /home/paperspace/projects/project/finetunellm/outputs/nemo-12b-creative/merged  # creative writing model
#   #   parameters:
#   #     density: 0.3
#   #     weight: 0.3
#   - model: alexxi19/ft-nemo-base-lora # sft model
#     parameters:
#       density: 0.7
#       weight: 0.5
# merge_method: dare_ties
# base_model: alexxi19/ft-nemo-base-lora
# parameters:
#   int8_mask: true
#   rescale: true
#   # normalize: true
# dtype: bfloat16
# chat_template: chatml


models:
  - model: anthracite-org/magnum-v2-12b  # instruct model
    parameters:
      density: 0.3
      weight: 0.5
  # - model: Nitral-AI/Captain_BMO-12B  # instruct model
  #   parameters:
  #     density: 0.3
  #     weight: 0.5
  - model: Nitral-AI/Captain-Eris_Violet-V0.420-12B  # creative writing model
    parameters:
      density: 0.2
      weight: 0.3
  - model: alexxi19/ft-v1-nemo-base # sft model
    parameters:
      density: 0.5
      weight: 0.5
base_model: alexxi19/ft-v1-nemo-base
dtype: bfloat16
chat_template: chatml
merge_method: della_linear
parameters:
  epsilon: 0.05
  int8_mask: true
  rescale: true
  lambda: 1.0


# models:
#   - model: Nitral-AI/Captain_BMO-12B # another sft model
#   - model: alexxi19/ft-v1-nemo-base # sft model
# merge_method: slerp
# base_model: alexxi19/ft-v1-nemo-base
# parameters:
#   t:
#     - filter: self_attn
#       value: [0, 0.5, 0.3, 0.7, 1]
#     - filter: mlp
#       value: [1, 0.5, 0.7, 0.3, 0]
#     - value: 0.3 # fallback for rest of tensors
# dtype: float16
# chat_template: chatml
```
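To reproduce the merge, save the active configuration above as `config.yml` and run mergekit's `mergekit-yaml config.yml ./merged` command. The sketch below shows one way to load the result for ChatML-style chat inference with transformers; the repo id `alexxi19/ft-v1-nemo-base-merge-v1` is an assumption about where the merge is published.

```python
# Minimal loading sketch for the merged model (assumes it is published
# as "alexxi19/ft-v1-nemo-base-merge-v1" with the ChatML template above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "alexxi19/ft-v1-nemo-base-merge-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches `dtype: bfloat16` in the config
    device_map="auto",
)

messages = [{"role": "user", "content": "Write a short scene set on a night train."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```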