Update README.md
README.md CHANGED
@@ -101,9 +101,9 @@ repetition penalty and low penalty range (about as long as the prior 2 messages)
 
 ## Training procedure
 [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) was used for training
-on
+on 2x NVidia A40 GPUs.
 
-The A40
+The A40 GPUs have been graciously provided by [Arc Compute](https://www.arccompute.io/).
 
 The model has been trained as an 8-bit LoRA adapter, and
 it's so large because a LoRA rank of 256 was also used. The reasoning was that this
@@ -133,4 +133,4 @@ the base Mistral-7B-v0.1 model.
 For the second pass, the `lora_model_dir` option was used to continue finetuning on the LoRA
 adapter obtained from the first pass.
 
-Using
+Using 2 GPUs, the effective global batch size would have been 128.
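For anyone trying to mirror this setup, the snippet below is a minimal sketch of the Axolotl options implied by the training notes above. Only the 8-bit adapter, the LoRA rank of 256, the Mistral-7B-v0.1 base, the use of `lora_model_dir` for the second pass, and the effective global batch size of 128 on 2 GPUs come from the README; every other key and value (alpha, dropout, target modules, the micro-batch/accumulation split, paths) is an illustrative assumption, not the actual training config.

```yaml
# Hypothetical Axolotl config sketch; values marked "assumed" are not from the README.
base_model: mistralai/Mistral-7B-v0.1   # the README names Mistral-7B-v0.1 as the base

load_in_8bit: true           # trained as an 8-bit LoRA adapter
adapter: lora
lora_r: 256                  # LoRA rank 256, as stated above
lora_alpha: 512              # assumed (a common choice is 2x the rank)
lora_dropout: 0.05           # assumed
lora_target_linear: true     # assumed

# Effective global batch size = micro_batch_size * gradient_accumulation_steps * num_gpus.
# With 2 GPUs, 8 * 8 * 2 = 128 matches the figure given above; the 8/8 split itself is assumed.
micro_batch_size: 8              # assumed
gradient_accumulation_steps: 8   # assumed

# Second pass only: continue finetuning from the adapter produced by the first pass.
# The directory name is hypothetical.
# lora_model_dir: ./lora-out
```

For the first pass `lora_model_dir` stays unset; for the second pass it is pointed at the adapter directory written by the first pass, which is what the `lora_model_dir` note in the diff refers to.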