Great merge!

#1 opened by CosmossG

Hello! Just wanted to congratulate you on this banger merge. Have you tried out merging Broken Tutu Unslop V2? I think that would make this model even better. But still, congrats, and thanks for the model!

Thanks! Yes, I tested and considered some other models like MistralThinker by Undi, Broken Tutu, and Forgotten Safeword, but I ran into issues (repeating nonsense) when combining Mistral 2501 finetunes with 2503/2506 ones. That's also why I had to omit Dolphin Venice and BlackSheep. It seems the architecture difference between Mistral Small 3.0 and 3.1 breaks the merges, at least it did with the prototypes I tested.
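
If it helps anyone running into the same thing: a quick way to spot that kind of incompatibility before burning hours on a merge is to diff the model configs. This is just a rough sketch, nothing official; the model IDs are ones mentioned in this thread and the field list is my own guess at what tends to matter for a dare_ties merge:

from transformers import AutoConfig

# Compare the published configs of two candidate merge inputs.
# getattr with a default keeps this from crashing on multimodal configs
# that nest most of these fields under text_config.
a = AutoConfig.from_pretrained("cognitivecomputations/Dolphin-Mistral-24B-Venice-Edition")
b = AutoConfig.from_pretrained("TheDrummer/Cydonia-24B-v3.1")

for field in ("architectures", "model_type", "num_hidden_layers", "hidden_size", "vocab_size"):
    va, vb = getattr(a, field, None), getattr(b, field, None)
    marker = "" if va == vb else "  <-- mismatch"
    print(f"{field}: {va} vs {vb}{marker}")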

Ah, I see. Are you planning on releasing a merge for those models as well? Would be interesting to see the quality of them.

Yes, the v0b prototype was a test of just four 2501 finetunes. I can upload that after my bandwidth resets. I ran a few tests and found v0c more promising than v0b; it's basically v0e without Harbinger and Omega.

I do have plans to improve the 2501 merge further though, adding models like Broken Tutu and other unique 2501 finetunes. That one might be released as Kraken-24B.

Model v0b uses this template:

base_model: cognitivecomputations/Dolphin-Mistral-24B-Venice-Edition
merge_method: dare_ties
dtype: bfloat16
models:
  - model: cognitivecomputations/Dolphin-Mistral-24B-Venice-Edition
    parameters:
      density: 0.5
      weight: 0.25
  - model: TroyDoesAI/BlackSheep-24B
    parameters:
      density: 0.5
      weight: 0.25
  - model: ReadyArt/Forgotten-Safeword-24B-v4.0
    parameters:
      density: 0.5
      weight: 0.25
  - model: Undi95/MistralThinker-v1.1
    parameters:
      density: 0.5
      weight: 0.25
tokenizer:
  source: union
chat_template: auto
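
For anyone who wants to run this config themselves: save it to a YAML file and feed it to mergekit. Below is a rough sketch using mergekit's Python entry point (based on the usage shown in its README; names like v0b.yml and ./Poppy-v0b are just placeholders) - the mergekit-yaml CLI does the same thing:

import torch
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# "v0b.yml" is a placeholder for the config above saved to disk.
with open("v0b.yml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./Poppy-v0b",  # output directory (name is just an example)
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # merge on GPU if one is available
        copy_tokenizer=True,             # copy a tokenizer into the output folder
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)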

This started with my first prototype, v0a, which produced broken output after a few messages. The only solution was splitting the 2501 and 2503/2506 finetunes into separate merges. For reference, this was the v0a config:

base_model: cognitivecomputations/Dolphin-Mistral-24B-Venice-Edition
merge_method: dare_ties
dtype: bfloat16
models:
  - model: cognitivecomputations/Dolphin-Mistral-24B-Venice-Edition
    parameters:
      density: 0.5
      weight: 0.1
  - model: TroyDoesAI/BlackSheep-24B
    parameters:
      density: 0.5
      weight: 0.1
  - model: PocketDoc/Dans-PersonalityEngine-V1.3.0-24b
    parameters:
      density: 0.5
      weight: 0.1
  - model: TheDrummer/Cydonia-24B-v3.1
    parameters:
      density: 0.5
      weight: 0.1
  - model: Gryphe/Codex-24B-Small-3.2
    parameters:
      density: 0.5
      weight: 0.1
  - model: Doctor-Shotgun/MS3.2-24B-Magnum-Diamond
    parameters:
      density: 0.5
      weight: 0.1
  - model: Undi95/MistralThinker-v1.1
    parameters:
      density: 0.5
      weight: 0.1
  - model: aixonlab/Eurydice-24b-v3.5
    parameters:
      density: 0.5
      weight: 0.1
  - model: ReadyArt/Forgotten-Safeword-24B-v4.0
    parameters:
      density: 0.5
      weight: 0.1
  - model: SicariusSicariiStuff/Impish_Magic_24B
    parameters:
      density: 0.5
      weight: 0.1
tokenizer:
  source: union
chat_template: auto
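
And for completeness, the broken-output issue is easy to reproduce with a short multi-turn smoke test. Rough sketch with transformers; the model path and prompts are just placeholders for whatever prototype you load:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "./Poppy-v0a"  # placeholder: local path of the merged prototype to test

tok = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, torch_dtype=torch.bfloat16, device_map="auto")

messages = []
for turn, prompt in enumerate(
    ["Introduce yourself.", "Continue the scene.", "Summarize what has happened so far."], start=1
):
    messages.append({"role": "user", "content": prompt})
    inputs = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
    out = model.generate(inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
    reply = tok.decode(out[0, inputs.shape[-1]:], skip_special_tokens=True)
    print(f"--- turn {turn} ---\n{reply}\n")
    messages.append({"role": "assistant", "content": reply})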
