Shamane committed · Commit dd3df15 · verified · 1 Parent(s): 0a2f79e

Update README.md

Files changed (1): README.md (+2 −2)
README.md CHANGED
@@ -9,9 +9,9 @@ tags:
 
 # Mistral-7B-Instruct-v0.2-expanded
 
-This method employs mergekit's passthrough method to expand blocks within the "mistralai/Mistral-7B-Instruct-v0.2" model. For every fourth layer,
+This method employs mergekit's passthrough method to expand blocks within the "mistralai/Mistral-7B-Instruct-v0.2" model. For every 5th layer,
 a new layer is added, with the `o_proj` and `down_proj` parameters of these added layers initialized to zero, mirroring the approach used in LLaMA Pro.
-It's important to note that this configuration has not undergone fine-tuning. Therefore, when fine-tuning, ensure that only every fourth layer is adjusted,
+It's important to note that this configuration has not undergone fine-tuning. Therefore, when fine-tuning, ensure that only every 5th layer is trainable,
 while all other layers remain frozen.
 
 ## 🧩 Configuration
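As a minimal sketch of the layer arithmetic behind this diff (assuming, as in LLaMA Pro, that one zero-initialized block is inserted after every four original blocks, so the new blocks occupy every 5th index of the expanded stack — `inserted_layer_indices` and the `stride` parameter are illustrative names, not part of mergekit):

```python
# Sketch: which layer indices of the expanded model hold the newly
# inserted (zero-initialized) blocks. Assumption: one new block is
# inserted after every `stride` original blocks, as in LLaMA Pro.
def inserted_layer_indices(num_original_layers: int, stride: int = 4) -> list[int]:
    """Indices of the inserted blocks in the expanded model; during
    fine-tuning only these would be left trainable, the rest frozen."""
    num_inserted = num_original_layers // stride
    # Each group of `stride` original layers is followed by one new layer,
    # so the k-th inserted layer lands at expanded index k*(stride+1)+stride.
    return [k * (stride + 1) + stride for k in range(num_inserted)]

# Mistral-7B-Instruct-v0.2 has 32 decoder layers, so the expanded model has 40.
print(inserted_layer_indices(32))  # → [4, 9, 14, 19, 24, 29, 34, 39]
```

Under this assumption, a fine-tuning loop would freeze every parameter whose layer index is not in this list, matching the "every 5th layer is trainable" wording of the updated README.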