	---
	license: apache-2.0
	language:
	- zh
	- en
	---

	# Chinese-LLaMA-2-7B-GGUF

	This repository contains the GGUF-v3 models (llama.cpp compatible) for Chinese-LLaMA-2-7B.


	## Performance

	Metric: perplexity (PPL); lower is better.
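For reference, perplexity over a token sequence $x_1, \dots, x_N$ is conventionally the exponentiated average negative log-likelihood. This is the standard textbook definition, given here as an assumption; the exact evaluation corpus and settings behind the numbers below are not stated in this README.

$$
\mathrm{PPL} = \exp\left(-\frac{1}{N}\sum_{i=1}^{N}\log p(x_i \mid x_{<i})\right)
$$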

	| Quant | Original | imatrix (`-im`) |
	|-----|------|------|
	| Q2_K | 15.1160 +/- 0.30469 | 12.7682 +/- 0.26022 |
	| Q3_K | 9.9588 +/- 0.20549 | 9.8508 +/- 0.20484 |
	| Q4_0 | 9.8085 +/- 0.20350 | - |
	| Q4_K | 9.5802 +/- 0.20015 | 9.6327 +/- 0.20219 |
	| Q5_0 | 9.4783 +/- 0.19622 | - |
	| Q5_K | 9.5132 +/- 0.19989 | 9.4447 +/- 0.19772 |
	| Q6_K | 9.4640 +/- 0.19909 | 9.4507 +/- 0.19849 |
	| Q8_0 | 9.4659 +/- 0.19927 | - |
	| F16 | 9.4627 +/- 0.19921 | - |

	Models with the `-im` suffix are quantized with an importance matrix (imatrix), which generally, though not always, gives lower perplexity.
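To make the trade-offs in the table easier to compare, the following minimal sketch computes each quant's perplexity increase relative to F16 and checks whether the imatrix variant improves on the original. The numbers are copied from the table (mean values only; the `+/-` terms are left out), and the helper names `relative_increase` and `imatrix_helps` are illustrative, not part of any tool.

```python
# Mean perplexity values copied from the table above as (original, imatrix).
# None marks quants for which no imatrix result is listed.
PPL = {
    "Q2_K": (15.1160, 12.7682),
    "Q3_K": (9.9588, 9.8508),
    "Q4_0": (9.8085, None),
    "Q4_K": (9.5802, 9.6327),
    "Q5_0": (9.4783, None),
    "Q5_K": (9.5132, 9.4447),
    "Q6_K": (9.4640, 9.4507),
    "Q8_0": (9.4659, None),
    "F16": (9.4627, None),
}


def relative_increase(ppl, baseline):
    """Return the perplexity increase over the baseline, in percent."""
    return 100.0 * (ppl - baseline) / baseline


def imatrix_helps(quant):
    """Return True/False if an imatrix result exists, otherwise None."""
    original, imatrix = PPL[quant]
    if imatrix is None:
        return None
    return imatrix < original


baseline = PPL["F16"][0]

for quant, (original, imatrix) in PPL.items():
    line = f"{quant:5s} original: {relative_increase(original, baseline):+6.2f}% vs F16"
    if imatrix is not None:
        line += f" | imatrix: {relative_increase(imatrix, baseline):+6.2f}% vs F16"
        line += " (better)" if imatrix_helps(quant) else " (worse)"
    print(line)
```

Note that this comparison ignores the reported `+/-` terms; several of the differences between rows (for example Q4_K original versus imatrix) are smaller than those terms, so they should be read with caution.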


	## Others


	For the Hugging Face version, see: https://huggingface.co/hfl/chinese-llama-2-7b

	Please refer to [https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/) for more details.