Chinese-Alpaca-2-13B-16K-GGUF

This repository contains the GGUF-v3 models (llama.cpp compatible) for Chinese-Alpaca-2-13B-16K.

Performance

Metric: PPL, lower is better

Quant	original	imatrix (`-im`)
Q2_K	12.7790 +/- 0.17943	13.8057 +/- 0.19614
Q3_K	10.0834 +/- 0.14063	9.6355 +/- 0.13483
Q4_0	9.7072 +/- 0.13563	-
Q4_K	9.2864 +/- 0.13001	9.2097 +/- 0.12874
Q5_0	9.2062 +/- 0.12846	-
Q5_K	9.0912 +/- 0.12705	9.0701 +/- 0.12668
Q6_K	9.0799 +/- 0.12681	9.0558 +/- 0.12653
Q8_0	9.0200 +/- 0.12616	-
F16	9.0142 +/- 0.12603	-

Models with the `-im` suffix are quantized with an importance matrix, which generally lowers PPL, though not always: for Q2_K the imatrix variant scores worse (13.8057 vs. 12.7790).
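The comparison above can be checked directly from the table. The following is a minimal sketch, not part of the original release: the PPL values are copied from the table, while the helper names `imatrix_delta` and `improved_by_imatrix` are made up for illustration.

```python
# PPL values copied from the table above: (original, imatrix).
# Only quants that have an imatrix variant are listed.
ppl = {
    "Q2_K": (12.7790, 13.8057),
    "Q3_K": (10.0834, 9.6355),
    "Q4_K": (9.2864, 9.2097),
    "Q5_K": (9.0912, 9.0701),
    "Q6_K": (9.0799, 9.0558),
}
F16_PPL = 9.0142  # unquantized baseline from the same table


def imatrix_delta(quant):
    """Return imatrix PPL minus original PPL (negative means imatrix is better)."""
    original, imatrix = ppl[quant]
    return imatrix - original


def improved_by_imatrix():
    """List the quants whose imatrix variant has lower PPL than the original."""
    return [quant for quant in ppl if imatrix_delta(quant) < 0]


for quant, (original, imatrix) in ppl.items():
    best = min(original, imatrix)
    print(f"{quant}: delta={imatrix_delta(quant):+.4f}, "
          f"gap to F16={best - F16_PPL:+.4f}")
```

Note that for Q4_K, Q5_K, and Q6_K the difference between the two variants is smaller than the reported +/- uncertainty, so the gain there may not be significant; only Q3_K (better) and Q2_K (worse) differ by more than the stated error.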

Others

For the Hugging Face version of this model, please see: https://huggingface.co/hfl/chinese-alpaca-2-13b-16k

Please refer to https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/ for more details.


GGUF

Model size: 13.3B params
Architecture: llama
Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit
