## This is a 4-bit quant of https://huggingface.co/Aeala/GPT4-x-AlpacaDente2-30b
# My secret sauce:
* Using commit <a href="https://github.com/0cc4m/GPTQ-for-LLaMa/tree/3c16fd9c7946ebe85df8d951cb742adbc1966ec7">3c16fd9</a> of 0cc4m's GPTQ fork
* Using PTB as the calibration dataset
* Act-order, True-sequential, percdamp 0.1 (<i>the default percdamp is 0.01</i>)
* No groupsize
* Runs with CUDA; does not need Triton.
* Quantization was completed on a 'Premium GPU', 'High Memory' Google Colab instance.
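For context on the percdamp setting above: in GPTQ, percdamp is the fraction of the mean diagonal of the layer Hessian that is added back to the diagonal before inversion, which stabilizes the quantization solve. The snippet below is a minimal sketch of just that damping step (not the quantizer itself); the function name is illustrative, not from the fork's code.

```python
import numpy as np

def dampen_hessian(H: np.ndarray, percdamp: float) -> np.ndarray:
    """Add percdamp * mean(diag(H)) to the Hessian diagonal, as GPTQ does
    before inverting H. A larger percdamp trades fidelity for stability."""
    damp = percdamp * np.mean(np.diag(H))
    return H + damp * np.eye(H.shape[0])

# Toy 2x2 Hessian: mean diagonal is 3.0, so percdamp=0.1 adds 0.3
# to each diagonal entry (this quant used 0.1 instead of the default 0.01).
H = np.array([[4.0, 1.0],
              [1.0, 2.0]])
H_damped = dampen_hessian(H, percdamp=0.1)
```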
## Benchmark results
|<b>Model</b>|<b>C4</b>|<b>WikiText2</b>|<b>PTB</b>|
|:---:|---|---|---|
|This Quant|7.3262|4.9571|24.9415|
|Aeala's Quant <a href="https://huggingface.co/Aeala/GPT4-x-AlpacaDente2-30b/resolve/main/4bit.safetensors">here</a>|x.xxxxxx|x.xxxxxx|x.xxxxxx|
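The figures above are perplexities, presumably produced by the eval script in the GPTQ fork (lower is better). The underlying formula is just the exponential of the mean negative log-likelihood per token, sketched here for reference:

```python
import math

def perplexity(nlls, n_tokens):
    """Perplexity = exp(total negative log-likelihood / token count)."""
    return math.exp(sum(nlls) / n_tokens)

# A model that assigns every token probability 1/5 (NLL = ln 5 each)
# has perplexity exactly 5 regardless of sequence length.
ppl = perplexity([math.log(5.0)] * 100, n_tokens=100)
```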