Great merge!
Hello! Just wanted to congratulate you on this banger merge. Have you tried out merging Broken Tutu Unslop V2? I think that would make this model even better. But still, congrats, and thanks for the model!
Thanks! Yes, I tested and considered some other models like MistralThinker by Undi, Broken Tutu, and Forgotten Safeword, but I ran into issues (repeating nonsense output) when combining Mistral 2501 finetunes with 2503/2506 ones. So I also had to omit Dolphin Venice and BlackSheep. It seems the architecture change from Mistral 3.0 to 3.1 breaks the merges, at least it did with the prototypes I tested.
Ah, I see. Are you planning on releasing a merge for those models as well? Would be interesting to see the quality of them.
Yes, the v0b prototype was a test of just four 2501 finetunes. I can upload it after my bandwidth resets. In my tests I found v0c more promising than v0b; it's basically v0e without Harbinger and Omega.
I do have plans to improve the 2501 merge further, though, adding models like Broken Tutu and other unique 2501 finetunes. That might be released as Kraken-24B.
Model v0b uses this config:
base_model: cognitivecomputations/Dolphin-Mistral-24B-Venice-Edition
merge_method: dare_ties
dtype: bfloat16
models:
  - model: cognitivecomputations/Dolphin-Mistral-24B-Venice-Edition
    parameters:
      density: 0.5
      weight: 0.25
  - model: TroyDoesAI/BlackSheep-24B
    parameters:
      density: 0.5
      weight: 0.25
  - model: ReadyArt/Forgotten-Safeword-24B-v4.0
    parameters:
      density: 0.5
      weight: 0.25
  - model: Undi95/MistralThinker-v1.1
    parameters:
      density: 0.5
      weight: 0.25
tokenizer:
  source: union
chat_template: auto
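For anyone curious what `dare_ties` with `density: 0.5` actually does to each model: the DARE step randomly drops a fraction (1 − density) of every finetune's parameter deltas and rescales the survivors by 1/density, so the expected contribution is preserved. A loose sketch in pure Python (toy numbers, not mergekit's actual implementation):

```python
import random

def dare(delta, density, seed=0):
    """DARE: Drop a random (1 - density) fraction of a finetune's
    parameter deltas And REscale the survivors by 1/density, so the
    expected value of each delta is unchanged."""
    rng = random.Random(seed)
    return [d / density if rng.random() < density else 0.0 for d in delta]

# Toy delta vector (finetuned weights minus base weights).
delta = [0.2, -0.1, 0.05, 0.3, -0.25, 0.15]
sparse = dare(delta, density=0.5)  # each entry is now 0.0 or 2x the original
```

With density 0.5, roughly half the deltas survive, each doubled; the sparsified deltas from all four models are then combined and added back to the base.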
This all started with my first prototype, v0a, which produced broken output after a few messages. The only solution was splitting the 2501 and 2503/2506 finetunes into separate merges. The v0a config:
base_model: cognitivecomputations/Dolphin-Mistral-24B-Venice-Edition
merge_method: dare_ties
dtype: bfloat16
models:
  - model: cognitivecomputations/Dolphin-Mistral-24B-Venice-Edition
    parameters:
      density: 0.5
      weight: 0.1
  - model: TroyDoesAI/BlackSheep-24B
    parameters:
      density: 0.5
      weight: 0.1
  - model: PocketDoc/Dans-PersonalityEngine-V1.3.0-24b
    parameters:
      density: 0.5
      weight: 0.1
  - model: TheDrummer/Cydonia-24B-v3.1
    parameters:
      density: 0.5
      weight: 0.1
  - model: Gryphe/Codex-24B-Small-3.2
    parameters:
      density: 0.5
      weight: 0.1
  - model: Doctor-Shotgun/MS3.2-24B-Magnum-Diamond
    parameters:
      density: 0.5
      weight: 0.1
  - model: Undi95/MistralThinker-v1.1
    parameters:
      density: 0.5
      weight: 0.1
  - model: aixonlab/Eurydice-24b-v3.5
    parameters:
      density: 0.5
      weight: 0.1
  - model: ReadyArt/Forgotten-Safeword-24B-v4.0
    parameters:
      density: 0.5
      weight: 0.1
  - model: SicariusSicariiStuff/Impish_Magic_24B
    parameters:
      density: 0.5
      weight: 0.1
tokenizer:
  source: union
chat_template: auto
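The equal 0.1 weights give each of the ten models the same share, and the TIES half of `dare_ties` then resolves conflicts between them by sign election: per parameter, keep only the contributions whose sign agrees with the weighted majority. A rough sketch of that step (toy values, not mergekit's exact code):

```python
def ties_merge(deltas, weights):
    """Per parameter: elect the majority sign from the weighted sum of
    all models' deltas, then keep and sum only the weighted contributions
    whose sign agrees with the elected sign."""
    merged = []
    for col in zip(*deltas):  # one column = one parameter across models
        total = sum(w * d for w, d in zip(weights, col))
        sign = 1.0 if total >= 0 else -1.0
        merged.append(sum(w * d for w, d in zip(weights, col) if d * sign > 0))
    return merged

# Two toy models, two parameters; they disagree on the sign of the second.
merged = ties_merge([[0.4, -0.2], [0.2, 0.3]], [0.5, 0.5])
# First parameter: both agree. Second: only the positive 0.3 survives.
```

The merged deltas are finally added onto the `base_model` weights, which is why a broken base-architecture mismatch (3.0 vs. 3.1 deltas) corrupts the whole result.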