JustJaro committed · Commit 6fb916d · verified · 1 Parent(s): 8e63c09

Update README.md

Files changed (1):
  1. README.md +2 -2
README.md CHANGED
@@ -11,9 +11,9 @@ tags:
 # 🔥 Quantized Model: Mistral-Small-24B-Instruct-2501_gptq_g32_4bit 🔥
 
 This is a 4-bit quantized version of the [mistralai/Mistral-Small-24B-Instruct-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501) model, quantized by [ConfidentialMind.com](https://www.confidentialmind.com) 🤖✨
-It leverages the open-source GPTQModel quantization to achieve 4-bit precision with a group size of 128 resulting in a
+It leverages the open-source GPTQModel quantization to achieve 4-bit precision with a group size of 32, resulting in a
 smaller,
-faster model with minimal performance degradation.
+faster model with minimal performance degradation. The G128 variant used MSE loss in order to avoid performance degradation.
 
 Ran on a single NVIDIA A100 GPU with 80GB of VRAM.
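As context for the change above (group size 128 → 32), here is a minimal sketch of what GPTQ's "group size" controls. The `hidden_size` value is an assumption chosen for illustration, not taken from this model's config.

```python
# Illustrative sketch (not part of the commit): what GPTQ "group size" means.
# In GPTQ, each row of a weight matrix is split into groups of consecutive
# weights, and every group gets its own quantization scale/zero-point.
# A smaller group (g=32 vs. g=128) tracks the local weight distribution more
# closely, usually improving accuracy at a small cost in stored scales.
hidden_size = 5120  # assumed row width, for illustration only

for group_size in (128, 32):
    groups_per_row = hidden_size // group_size
    print(f"g={group_size}: {groups_per_row} scale/zero-point pairs per row")
```

The trade-off the commit documents is exactly this: g=32 stores four times as many per-group scales as g=128, in exchange for finer-grained quantization.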