MLDataScientist committed: Update README.md
This is a 3-bit AutoRound GPTQ version of Mistral-Large-Instruct-2407.

This conversion used model-*.safetensors.

This quantized model needs at least ~50GB of VRAM for the weights plus ~5GB for context; I quantized it so that it fits in 64GB of VRAM.
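As a rough sanity check on the ~50GB figure (assuming the ~123B-parameter size of Mistral-Large-Instruct-2407, which is not stated in this card), the 3-bit weight footprint can be estimated as:

```python
# Back-of-the-envelope estimate of the quantized weight footprint.
# Assumes ~123B parameters (Mistral-Large-Instruct-2407) at 3 bits per weight;
# actual usage is somewhat higher due to group-wise scales/zero-points,
# unquantized layers, and runtime overhead.
params = 123e9          # approximate total parameter count (assumption)
bits_per_weight = 3     # AutoRound GPTQ 3-bit quantization
bytes_total = params * bits_per_weight / 8
print(f"~{bytes_total / 1e9:.0f} GB for weights alone")  # ~46 GB
```

Adding quantization metadata and the KV cache on top of the ~46GB of packed weights lands near the ~50GB + ~5GB figures quoted above.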

Quantization script (quantizing takes around 520GB of RAM and about 20 hours on a 48GB A40 GPU):

```