Update README.md
README.md
CHANGED
@@ -9,9 +9,9 @@ tags:
 
 # Mistral-7B-Instruct-v0.2-expanded
 
-This method employs mergekit's passthrough method to expand blocks within the "mistralai/Mistral-7B-Instruct-v0.2" model. For every
+This method employs mergekit's passthrough method to expand blocks within the "mistralai/Mistral-7B-Instruct-v0.2" model. For every 5th layer,
 a new layer is added, with the `o_proj` and `down_proj` parameters of these added layers initialized to zero, mirroring the approach used in LLaMA Pro.
-It's important to note that this configuration has not undergone fine-tuning. Therefore, when fine-tuning, ensure that only every
+It's important to note that this configuration has not undergone fine-tuning. Therefore, when fine-tuning, ensure that only every 5th layer is trainable,
 while all other layers remain frozen.
 
 ## 🧩 Configuration
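For reference, here is a minimal sketch (not part of the card itself) of the freezing step described in the diff above, using PyTorch and transformers. The checkpoint path is a placeholder, and the sketch assumes the inserted blocks can be identified by the zero-initialized `o_proj` weights the README describes.

```python
import torch
from transformers import AutoModelForCausalLM

# Placeholder path: point this at the expanded merge output.
model = AutoModelForCausalLM.from_pretrained("path/to/Mistral-7B-Instruct-v0.2-expanded")

# Freeze every parameter first.
for param in model.parameters():
    param.requires_grad = False

# Unfreeze only the inserted blocks. Rather than hard-coding "every 5th
# layer", detect them by their `o_proj` weights, which the merge leaves
# at exactly zero until fine-tuning begins (an assumption based on the
# zero-initialization described above).
for idx, layer in enumerate(model.model.layers):
    if torch.all(layer.self_attn.o_proj.weight == 0):
        for param in layer.parameters():
            param.requires_grad = True
        print(f"Layer {idx} is trainable")

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable: {trainable:,} / {total:,} parameters")
```

Detecting the inserted layers by their zero weights, rather than hard-coding every 5th index, keeps the check valid regardless of how the passthrough slices were interleaved; run it before any training step, since the weights stop being exactly zero once updates begin.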