## This is a 4bit quant of https://huggingface.co/Aeala/GPT4-x-AlpacaDente2-30b

# My secret sauce:
* Using commit <a href="https://github.com/0cc4m/GPTQ-for-LLaMa/tree/3c16fd9c7946ebe85df8d951cb742adbc1966ec7">3c16fd9</a> of 0cc4m's GPTQ fork
* Using PTB as the calibration dataset
* Act-order, True-sequential, percdamp 0.1 (<i>the default percdamp is 0.01</i>)
* No groupsize
* Will run with CUDA, does not need triton.
* Quant completed on a 'Premium GPU' and 'High Memory' Google Colab.
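For readers who want to reproduce a quant with these settings, the list above maps onto the GPTQ-for-LLaMa command line roughly as follows. This is a hedged sketch based on the upstream GPTQ-for-LLaMa CLI, not the exact command used here; the model path and output filename are placeholders, and flag names may differ in this fork revision, so check the repo first.

```shell
# Sketch only: 4-bit GPTQ quantization with act-order, true-sequential,
# percdamp 0.1, PTB calibration, and no groupsize (the groupsize flag is
# simply omitted). Paths are hypothetical.
python llama.py /path/to/GPT4-x-AlpacaDente2-30b ptb \
    --wbits 4 \
    --act-order \
    --true-sequential \
    --percdamp 0.1 \
    --save_safetensors 4bit.safetensors
```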
## Benchmark results

|<b>Model</b>|<b>C4</b>|<b>WikiText2</b>|<b>PTB</b>|
|:---:|---|---|---|
|This Quant|x.xxxxxx|x.xxxxxx|x.xxxxxx|
|Aeala's Quant <a href="https://huggingface.co/Aeala/GPT4-x-AlpacaDente2-30b/resolve/main/4bit.safetensors">here</a>|x.xxxxxx|x.xxxxxx|x.xxxxxx|
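The C4, WikiText2, and PTB scores above are perplexities (lower is better). As a minimal illustration of what that number means, independent of any benchmark harness: perplexity is the exponential of the mean negative log-likelihood the model assigns to the correct next token.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative log-likelihood) over the
    probabilities the model assigned to each correct next token."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# A model that gives every correct token probability 0.25 scores
# perplexity ~4.0: it is "as uncertain as" a uniform 4-way guess.
print(perplexity([0.25, 0.25, 0.25, 0.25]))
```

A quant whose perplexity stays close to the full-precision model's on these datasets has lost little quality to quantization.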