Crystalcareai committed: Update README.md
This model utilizes PEFT layer replication at inference time to duplicate layers, and the adapter that is attached as well. Performance will be similar with both methods, but VRAM use is considerably less when using the adapter.
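To see what layer replication does in isolation: PEFT's `LoraConfig` accepts a `layer_replication` list of half-open `(start, end)` ranges whose selected layers are concatenated into the final stack. A minimal, dependency-free sketch of how such ranges expand into a layer map — the ranges below are illustrative, not this model's actual configuration:

```python
# PEFT-style layer replication: each (start, end) range is half-open, and the
# selected layer indices are concatenated in order. Replicated layers share
# the same base weights, so only the small per-replica adapter costs extra
# VRAM -- which is why the adapter method is so much lighter.
def expand_layer_map(replication):
    layer_map = []
    for start, end in replication:
        layer_map.extend(range(start, end))
    return layer_map

# Illustrative only: grow a 32-layer base model to 48 effective layers
# by repeating layers 8-23 a second time.
layer_map = expand_layer_map([(0, 24), (8, 32)])
print(len(layer_map))  # 48 effective layers
```

The replicated indices point at shared tensors, so the duplicated model's extra memory cost is essentially just the adapter weights attached to each replica.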
This model was initialized using [Unsloth's Mistralfied Phi-3-Instruct-4k](https://huggingface.co/unsloth/Phi-3-mini-4k-instruct). If you choose to use the adapter method, please attach it to their model.
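If you go the adapter route, loading could look like the sketch below. The adapter repo id is a placeholder (substitute this model's actual adapter path), and calling the function downloads the full base weights:

```python
# Hedged sketch: attach the adapter to the Unsloth base model it was
# initialized from. "your-org/your-adapter" is a PLACEHOLDER, not a real
# repo id. Calling load_with_adapter() downloads several GB of weights,
# so the heavy imports and calls live inside the function.
def load_with_adapter(adapter_id="your-org/your-adapter"):
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    # The Mistralfied base named in the model card above.
    base = AutoModelForCausalLM.from_pretrained("unsloth/Phi-3-mini-4k-instruct")
    # Attaching the adapter replays the layer replication at load time.
    model = PeftModel.from_pretrained(base, adapter_id)
    tokenizer = AutoTokenizer.from_pretrained("unsloth/Phi-3-mini-4k-instruct")
    return model, tokenizer
```

With this approach only the base model plus the adapter weights sit in VRAM, rather than a fully materialized duplicated-layer checkpoint.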
![VRAM use comparison](https://i.ibb.co/C6sqLBH/Vram-Use.png)
This model is based on Phi-3-Mini-Instruct-4k, and is governed by the MIT license under which Microsoft released Phi-3.
The base model has 4k context, and the QLoRA fine-tuning was performed with a 4k sequence length.