## This is a 4bit quant of https://huggingface.co/Aeala/GPT4-x-AlpacaDente2-30b

# My secret sauce:
* Using commit <a href="https://github.com/0cc4m/GPTQ-for-LLaMa/tree/3c16fd9c7946ebe85df8d951cb742adbc1966ec7">3c16fd9</a> of 0cc4m's GPTQ fork
* Using PTB as the calibration dataset
* Act-order, True-sequential, percdamp 0.1 (<i>the default percdamp is 0.01</i>)
* No groupsize
* Will run with CUDA, does not need triton.
* Quant completed on a 'Premium GPU' and 'High Memory' Google Colab.
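For readers who want to reproduce a quant with these settings, the list above maps onto the GPTQ-for-LLaMa command line roughly as follows. This is a hedged sketch based on the upstream GPTQ-for-LLaMa CLI, not the exact command used here; the model path and output filename are placeholders, and flag names may differ in this fork revision, so check the repo first.

```shell
# Sketch only: 4-bit GPTQ quantization with act-order, true-sequential,
# percdamp 0.1, PTB calibration, and no groupsize (the groupsize flag is
# simply omitted). Paths are hypothetical.
python llama.py /path/to/GPT4-x-AlpacaDente2-30b ptb \
    --wbits 4 \
    --act-order \
    --true-sequential \
    --percdamp 0.1 \
    --save_safetensors 4bit.safetensors
```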
## Benchmark results

|<b>Model</b>|<b>C4</b>|<b>WikiText2</b>|<b>PTB</b>|
|:---:|---|---|---|
|This Quant|x.xxxxxx|x.xxxxxx|x.xxxxxx|
|Aeala's Quant <a href="https://huggingface.co/Aeala/GPT4-x-AlpacaDente2-30b/resolve/main/4bit.safetensors">here</a>|x.xxxxxx|x.xxxxxx|x.xxxxxx|
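The C4, WikiText2, and PTB scores above are perplexities (lower is better). As a minimal illustration of what that number means, independent of any benchmark harness: perplexity is the exponential of the mean negative log-likelihood the model assigns to the correct next token.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative log-likelihood) over the
    probabilities the model assigned to each correct next token."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# A model that gives every correct token probability 0.25 scores
# perplexity ~4.0: it is "as uncertain as" a uniform 4-way guess.
print(perplexity([0.25, 0.25, 0.25, 0.25]))
```

A quant whose perplexity stays close to the full-precision model's on these datasets has lost little quality to quantization.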