---
license: apache-2.0
language:
- zh
- en
tags:
  - moe
---

# Chinese-Mixtral-Instruct-GGUF
<p align="center">
    <a href="https://github.com/ymcui/Chinese-Mixtral"><img src="https://ymcui.com/images/chinese-mixtral-banner.png" width="600"/></a>
</p>

**Chinese Mixtral GitHub repository: https://github.com/ymcui/Chinese-Mixtral**

This repository contains the GGUF-v3 models (llama.cpp compatible) for **Chinese-Mixtral-Instruct** (chat/instruction model).

**Note: When using the instruction/chat model, you MUST follow the official prompt template! Example: [chat.sh](https://github.com/ymcui/Chinese-Mixtral/blob/main/scripts/llamacpp/chat.sh)**

## Performance

Metric: perplexity (PPL); lower is better.

| Quant   | Size ↓  | PPL                |
| ------- | ------- | ------------------ |
| IQ1_S   | 9.8 GB  | 9.5782 +/- 0.08909 |
| IQ1_M   | 10.8 GB | 7.4666 +/- 0.06741 |
| IQ2_XXS | 12.3 GB | 6.3923 +/- 0.05674 |
| IQ2_XS  | 13.7 GB | 6.0606 +/- 0.05834 |
| IQ2_S   | 14.1 GB | 4.7617 +/- 0.04177 |
| IQ2_M   | 15.5 GB | 4.5911 +/- 0.04054 |
| Q2_K    | 17.3 GB | 4.8592 +/- 0.04303 |
| IQ3_XXS | 18.3 GB | 4.3557 +/- 0.03846 |
| IQ3_XS  | 19.3 GB | 4.3328 +/- 0.03779 |
| IQ3_S   | 20.4 GB | 4.3138 +/- 0.03785 |
| IQ3_M   | 21.4 GB | 4.3024 +/- 0.03775 |
| Q3_K    | 22.5 GB | 4.4334 +/- 0.03937 |
| IQ4_XS  | 25.1 GB | 4.2324 +/- 0.03757 |
| Q4_0    | 26.4 GB | 4.2688 +/- 0.03787 |
| IQ4_NL  | 26.5 GB | 4.2384 +/- 0.03763 |
| Q4_K    | 28.4 GB | 4.2433 +/- 0.03768 |
| Q5_0    | 32.2 GB | 4.2142 +/- 0.03733 |
| Q5_K    | 33.2 GB | 4.2177 +/- 0.03743 |
| Q6_K    | 38.4 GB | 4.2184 +/- 0.03754 |
| Q8_0    | 49.6 GB | 4.2053 +/- 0.03732 |
| F16     | 93.5 GB | x                  |

Because of the file size limit, the F16 model is split into multiple parts. Use the `cat` command to concatenate all parts into a single file. **You must concatenate the parts in order.**
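
The concatenation step can be sketched as a small shell helper. This is a minimal sketch, not an official script: the function names `merge_parts` and `has_gguf_magic` and the file names in the usage comment are hypothetical, so substitute the actual part names from this repository. The check relies on the fact that a GGUF file begins with the four ASCII bytes `GGUF`, which makes it a quick way to catch parts merged in the wrong order.

```shell
#!/bin/sh
# Sketch only: merge split model parts into a single file.
# Usage: merge_parts OUTPUT PART1 PART2 ...
# Parts are concatenated in exactly the order given on the command line.
merge_parts() {
    out="$1"
    shift
    cat "$@" > "$out"
}

# Succeeds if the file starts with the 4-byte GGUF magic "GGUF".
# A merged file that fails this check was likely assembled out of order.
has_gguf_magic() {
    [ "$(head -c 4 "$1")" = "GGUF" ]
}

# Example (hypothetical file names; use the real part names):
#   merge_parts model-f16.gguf model-f16.gguf.part-a model-f16.gguf.part-b
#   has_gguf_magic model-f16.gguf && echo "looks like a GGUF file"
```

Passing the parts explicitly, rather than relying on a glob, keeps the order visible on the command line. The magic check only validates the start of the file, so it does not detect a missing or swapped later part.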


## Others

For the Hugging Face version, please see: https://huggingface.co/hfl/chinese-mixtral-instruct

Please refer to [https://github.com/ymcui/Chinese-Mixtral/](https://github.com/ymcui/Chinese-Mixtral/) for more details.


## Citation

Please consider citing our paper if you use the resources in this repository.
Paper link: https://arxiv.org/abs/2403.01851
```
@article{chinese-mixtral,
      title={Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral}, 
      author={Cui, Yiming and Yao, Xin},
      journal={arXiv preprint arXiv:2403.01851},
      url={https://arxiv.org/abs/2403.01851},
      year={2024}
}
```