Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


typhoon-7b - bnb 4bits
- Model creator: https://huggingface.co/scb10x/
- Original model: https://huggingface.co/scb10x/typhoon-7b/
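
These are 4-bit weights produced with bitsandbytes (bnb). For reference, the snippet below shows one way to load the checkpoint in bnb 4-bit with 🤗 Transformers; a minimal sketch, assuming `bitsandbytes` and `accelerate` are installed, and using common NF4 defaults rather than necessarily the exact settings used for this upload:

```
# A minimal sketch of loading typhoon-7b in bnb 4-bit with Transformers.
# Assumes `bitsandbytes` and `accelerate` are installed; the NF4 settings
# below are common defaults, not necessarily the exact ones used here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "scb10x/typhoon-7b"  # original checkpoint, quantized on load

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```

Loading this repository's pre-quantized weights directly should work the same way, with the repository id swapped in.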

Original model description:

---
license: apache-2.0
language:
- th
library_name: transformers
pipeline_tag: text-generation
tags:
- pretrained
---
# Typhoon-7B: Thai Large Language Model (Pretrained)

**Typhoon-7B** is a *pretrained* Thai 🇹🇭 large language model with 7 billion parameters, based on Mistral-7B.

**Typhoon-7B** outperforms all open-source Thai language models available at the time of writing, as evaluated on Thai examination benchmarks, and its instruction-tuned variant achieves the best results on instruction-following tasks. Its performance in Thai is also on par with GPT-3.5, while it is 2.62 times more efficient at tokenizing Thai text.
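
The tokenizer-efficiency figure comes from the paper's evaluation; as a rough illustration of what it measures, the sketch below counts tokens for the same Thai sentence under both tokenizers (the base-model repository id and the sample sentence are assumptions for illustration only):

```
# Rough illustration of tokenizer efficiency: count tokens for the same
# Thai sentence under both tokenizers. The base-model repository id and
# the sample sentence are assumptions for illustration only.
from transformers import AutoTokenizer

thai_text = "ประเทศไทยมีประชากรประมาณเจ็ดสิบล้านคน"

typhoon_tok = AutoTokenizer.from_pretrained("scb10x/typhoon-7b")
mistral_tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

n_typhoon = len(typhoon_tok.encode(thai_text, add_special_tokens=False))
n_mistral = len(mistral_tok.encode(thai_text, add_special_tokens=False))

# Fewer tokens for the same text means cheaper inference on Thai.
print(f"Typhoon: {n_typhoon} tokens, Mistral: {n_mistral} tokens")
print(f"ratio: {n_mistral / n_typhoon:.2f}x")
```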

**This is not an instruction-tuned model** - it may not be able to follow human instructions without one/few-shot learning or instruction fine-tuning. The model does not have any moderation mechanism and may generate harmful or inappropriate responses.

The Instruct model (chat model) will be released soon. Registration for the beta version is open at https://opentyphoon.ai/, or follow us at https://twitter.com/opentyphoon for future model releases.

<div align="center">
  <img src="https://storage.googleapis.com/scb10x-ai-lab-public/assets/typhoon_benchmark.png" alt="Typhoon benchmark" width="100%" style="margin-left:auto; margin-right:auto; display:block"/>
</div>

For full details of this model, please read our [paper](https://arxiv.org/abs/2312.13951).

## Model Description
- **Model type**: A 7B pretrained decoder-only model
- **Requirement**: transformers 4.34.0 or newer
- **Primary Language(s)**: Thai 🇹🇭 and English 🇬🇧
- **License**: Apache-2.0 (Commercial)
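
A quick way to verify the transformers requirement above before loading the model; a minimal sketch using the `packaging` helper that ships as a transformers dependency:

```
# Quick check that the installed transformers meets the stated requirement.
import transformers
from packaging import version

assert version.parse(transformers.__version__) >= version.parse("4.34.0"), (
    f"transformers {transformers.__version__} found; 4.34.0 or newer required"
)
```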

## Performance on Thai Benchmark

| **Model**           | **ONET** | **IC** | **TGAT** | **TPAT-1** | **A-Level** |
|---------------------|----------|--------|----------|------------|-------------|
| Typhoon-7B          | 0.379    | 0.393  | 0.700    | 0.414      | 0.324       |
| SeaLLM-7B           | 0.342    | 0.256  | 0.589    | 0.336      | 0.305       |
| OpenThaiGPT-beta-7B | 0.180    | 0.278  | 0.411    | 0.319      | 0.243       |
| WangChanGLM         | 0.192    | 0.271  | 0.167    | 0.172      | 0.175       |
| SEA-LION-7B         | 0.179    | 0.290  | 0.244    | 0.198      | 0.175       |
| Avg. Human          | 0.318    | -      | 0.472    | 0.406      | -           |

## Intended Uses & Limitations

This model is a pretrained base model; it may not be able to follow human instructions without one/few-shot learning or instruction fine-tuning. The model does not have any moderation mechanism and may generate harmful or inappropriate responses.
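
Because it is a base model, it typically responds better to a few in-context examples than to bare instructions. A minimal one-shot sketch follows; the prompt layout and generation settings are illustrative assumptions, not an official template:

```
# One-shot prompting of the base model: demonstrate the pattern, then ask.
# The prompt layout and generation settings are illustrative, not official.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "scb10x/typhoon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = (
    "English: Hello\nThai: สวัสดี\n"
    "English: Thank you\nThai:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)

# Decode only the newly generated tokens.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```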

## Follow us

https://twitter.com/opentyphoon

## Support / Ask any question

https://discord.gg/CqyBscMFpg

## SCB10X AI Team

- Kunat Pipatanakul, Phatrasek Jirabovonvisut, Potsawee Manakul, Sittipong Sripaisarnmongkol, Ruangsak Patomwong, Pathomporn Chokchainant, Kasima Tharnpipitchai
- If you find Typhoon-7B useful for your work, please cite it using:
```
@article{pipatanakul2023typhoon,
    title={Typhoon: Thai Large Language Models},
    author={Kunat Pipatanakul and Phatrasek Jirabovonvisut and Potsawee Manakul and Sittipong Sripaisarnmongkol and Ruangsak Patomwong and Pathomporn Chokchainant and Kasima Tharnpipitchai},
    year={2023},
    journal={arXiv preprint arXiv:2312.13951},
    url={https://arxiv.org/abs/2312.13951}
}
```

## Contact Us

- General & Collaboration: [email protected], [email protected]
- Technical: [email protected]