README.md · R136a1/MythoMax-L2-13B-exl2 at abdfc216d5ac44d6b14757a405c005d121c9e01d

metadata

license: other
language:
  - en

Other quantized models are available from TheBloke: GGML - GPTQ - GGUF - AWQ

Model details

| Branch | Bits | Perplexity | Description |
| --- | --- | --- | --- |
| main | 5 | 6.1018 | Up to 6144 tokens of context on a T4 GPU |
| 6bit | 6 | 6.1182 | 4096 tokens of context on a T4 GPU |
| - | 7 | 6.1056 | 2048 tokens max context on a T4 GPU |
| - | 8 | 6.1027 | Just, why? |

I'll upload the 7-bit and 8-bit quants if someone requests them. (I don't know why the 5-bit quant's perplexity is lower than that of the higher-bit quants; this needs more testing.)
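To make the branch choice concrete, here is a minimal sketch that picks an uploaded branch from the table above based on how much context you need on a T4. The helper names (`pick_branch`, `download`) and the "prefer the higher-bit branch when it fits" policy are assumptions for illustration, not part of this repo. Given the perplexity numbers above, you may reasonably prefer `main` regardless. The download step assumes `huggingface_hub` is installed and uses its `snapshot_download` function with the `revision` argument to select the branch.

```python
# Uploaded branches and their T4 context limits, copied from the table above.
# The 7-bit and 8-bit quants are listed but not uploaded, so they are omitted.
BRANCHES = {
    "main": {"bits": 5, "max_context": 6144},
    "6bit": {"bits": 6, "max_context": 4096},
}

REPO_ID = "R136a1/MythoMax-L2-13B-exl2"


def pick_branch(context_tokens: int) -> str:
    """Return the highest-bit uploaded branch whose T4 limit covers the request.

    The "highest bits that fit" rule is a hypothetical policy, not the author's.
    """
    fitting = [
        (info["bits"], name)
        for name, info in BRANCHES.items()
        if info["max_context"] >= context_tokens
    ]
    if not fitting:
        raise ValueError(f"no uploaded branch fits {context_tokens} tokens on a T4")
    return max(fitting)[1]


def download(context_tokens: int) -> str:
    """Download the chosen branch and return the local folder path.

    Requires `pip install huggingface_hub`; imported lazily so that
    pick_branch() works without it.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, revision=pick_branch(context_tokens))
```

For example, `pick_branch(4096)` selects the `6bit` branch, while `pick_branch(6000)` falls back to `main`, the only uploaded branch listed with enough context on a T4.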

Alpaca format:

### Instruction:


### Response:
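
The template above can be filled in programmatically. This is a minimal sketch, assuming the instruction text goes on the line after `### Instruction:` and the model continues right after `### Response:`. The function name `build_alpaca_prompt` is hypothetical, and the exact blank-line spacing is an assumption based on the layout shown here.

```python
def build_alpaca_prompt(instruction: str) -> str:
    """Fill the instruction slot and leave the response slot open for the model."""
    # Assumed layout: header, instruction, blank line, response header.
    return f"### Instruction:\n{instruction.strip()}\n\n### Response:\n"


prompt = build_alpaca_prompt("Summarize the plot of Hamlet in two sentences.")
print(prompt)
```

Pass the resulting string to whatever loader you use for the exl2 weights; generation should stop when the model starts a new `### Instruction:` block or hits its end-of-sequence token.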