Update README.md

README.md — CHANGED

@@ -15,6 +15,8 @@ It leverages the open-source GPTQModel quantization to achieve 4-bit precision w
 smaller,
 faster model with minimal performance degradation.
 
+NOTE: High perplexity, maybe due to MSE. Non-MSE quant either present or coming.
+
 Ran on a single NVIDIA A100 GPU with 80GB of VRAM.
 
 *Note* `batch_size` is set quite high as the model is small, you may need to adjust this to your GPU VRAM.
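
For context, a typical GPTQModel 4-bit quantization run looks like the sketch below. The model id, output path, calibration set, and `batch_size=32` are illustrative assumptions, not values from this repo; only the GPTQModel calls (`QuantizeConfig`, `GPTQModel.load`, `model.quantize`, `model.save`) follow the library's documented usage. Requires a CUDA GPU.

```python
from datasets import load_dataset
from gptqmodel import GPTQModel, QuantizeConfig

# Hypothetical model id and output path -- substitute the repo's actual values.
model_id = "meta-llama/Llama-3.2-1B-Instruct"
quant_path = "Llama-3.2-1B-Instruct-gptqmodel-4bit"

# A small calibration sample (here: C4 text, an assumed choice).
calibration_dataset = load_dataset(
    "allenai/c4",
    data_files="en/c4-train.00001-of-01024.json.gz",
    split="train",
).select(range(1024))["text"]

# 4-bit weights with the common group size of 128.
quant_config = QuantizeConfig(bits=4, group_size=128)

model = GPTQModel.load(model_id, quant_config)

# batch_size trades VRAM for speed: a small model on an 80 GB A100 tolerates a
# high value, but lower it if you hit out-of-memory errors on a smaller GPU.
model.quantize(calibration_dataset, batch_size=32)

model.save(quant_path)
```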