EmbeddedLLM/Mistral-7B-Merge-02-v0

thesunday committed on Dec 20, 2023

Commit b142b88 · 1 Parent(s): 7d92ce6

Update model card

Files changed (1)

README.md +37 -0

@@ -1,3 +1,40 @@
 ---
 license: apache-2.0
+language:
+- en
+tags:
+- merge
 ---
+# Model Description
+This is an experiment comparing two ways of merging two models: DARE TIES versus SLERP 🦙.
+We are mainly interested in comparing against [Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp](https://huggingface.co/Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp).
+The two models involved in the merge are as follows:
+1. [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B)
+2. [Intel/neural-chat-7b-v3-3](https://huggingface.co/Intel/neural-chat-7b-v3-3)
+- base model: [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
+The YAML config file for the merge is:
+```yaml
+models:
+  - model: mistralai/Mistral-7B-v0.1
+    # no parameters necessary for base model
+  - model: teknium/OpenHermes-2.5-Mistral-7B
+    parameters:
+      weight: 0.5
+      density: 0.5
+  - model: Intel/neural-chat-7b-v3-3
+    parameters:
+      weight: 0.5
+      density: 0.5
+merge_method: dare_ties
+base_model: mistralai/Mistral-7B-v0.1
+parameters:
+  int8_mask: true
+dtype: bfloat16
+```
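
For intuition about what `merge_method: dare_ties` with `weight: 0.5` and `density: 0.5` roughly does, here is a minimal Python sketch on toy parameter lists. It is an illustration under stated assumptions, not the implementation of the tool that consumed the config above: the function names `dare_sparsify` and `dare_ties_merge` are hypothetical, parameters are flat lists of floats rather than tensors, and details such as normalization and `int8_mask` are left out.

```python
import random


def dare_sparsify(delta, density, rng):
    """DARE step: keep each entry with probability `density` and rescale
    the survivors by 1/density so the expected value is unchanged."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def dare_ties_merge(base, finetuned, weights, densities, seed=0):
    """Toy DARE-TIES merge over flat lists of floats (no normalization)."""
    rng = random.Random(seed)

    # Weighted, sparsified task vectors (fine-tuned minus base), one per model.
    deltas = []
    for params, weight, density in zip(finetuned, weights, densities):
        task_vector = [p - b for p, b in zip(params, base)]
        sparse = dare_sparsify(task_vector, density, rng)
        deltas.append([weight * d for d in sparse])

    merged = []
    for i, b in enumerate(base):
        column = [delta[i] for delta in deltas]
        total = sum(column)
        # TIES-style sign election: the sign of the summed deltas wins,
        # and only deltas that agree with it are added to the base.
        if total > 0:
            kept = [d for d in column if d > 0]
        elif total < 0:
            kept = [d for d in column if d < 0]
        else:
            kept = []
        merged.append(b + sum(kept))
    return merged


base = [0.0, 0.0, 0.0]
model_a = [1.0, -1.0, 2.0]
model_b = [1.0, 1.0, -1.0]
# density=1.0 disables random dropping, so this call is deterministic.
print(dare_ties_merge(base, [model_a, model_b], [0.5, 0.5], [1.0, 1.0]))
# prints [1.0, 0.0, 1.0]
```

In the toy run, the second parameter cancels out because the two models disagree with equal magnitude, and the third keeps only the delta whose sign matches the elected one. With `density: 0.5`, as in the config, about half of each task vector would be dropped at random and the rest doubled before this sign election.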