benk04/NoromaidxOpenGPT4-2-3.75bpw-h6-exl2

Text Generation

Not-For-All-Audiences

nsfw

text-generation-inference

Inference Endpoints


benk04 committed on Jun 2

Commit d94dc1d, 1 parent: e8a2040

Update README.md

Files changed (1)

README.md +5 -4


@@ -13,12 +13,13 @@ license: cc-by-nc-4.0
 ---
 <!-- description start -->
-My Exllamav2 3.75 bpw quantization of [NoromaidxOpenGPT4-2](https://huggingface.co/NeverSleep/NoromaidxOpenGPT4-2), quantized with default calibration dataset. Included is measurement json, so you can do your own quants.
 > [!IMPORTANT]
->This bpw is the perfect size for 24GB cards, and can fit 32k context. Make sure to enable 4-bit cache option.
->[!NOTE]
-> This model is great for rp and I recommend using the Alpaca presets in SillyTavern.
 ## Original Card
 ## Description

 ---
 <!-- description start -->
+Exllamav2 3.75bpw quantization of NoromaidxOpenGPT4-2 from [NeverSleep](https://huggingface.co/NeverSleep/NoromaidxOpenGPT4-2), quantized with default calibration dataset. Included is measurement json file, so you can do your own quants.
 > [!IMPORTANT]
+>This bpw is the perfect size for 24GB GPUs, and can fit 32k context. Make sure to enable 4-bit cache option or you'll run into OOM errors.
+> [!NOTE]
+> **Notes:**
+> This model is one of the better mixtral derivatives for rp, and I recommend using it with the Alpaca preset in SillyTavern.
 ## Original Card
 ## Description