Upload folder using huggingface_hub
- .gitattributes +1 -0
- README.md +34 -72
- all_results.json +10 -10
- cantonese_llm_v1.jpg +3 -0
- config.json +1 -1
- eval_results.json +5 -5
- model-00001-of-00031.safetensors +1 -1
- model-00002-of-00031.safetensors +1 -1
- model-00003-of-00031.safetensors +1 -1
- model-00004-of-00031.safetensors +1 -1
- model-00005-of-00031.safetensors +1 -1
- model-00006-of-00031.safetensors +1 -1
- model-00007-of-00031.safetensors +1 -1
- model-00008-of-00031.safetensors +1 -1
- model-00009-of-00031.safetensors +1 -1
- model-00010-of-00031.safetensors +1 -1
- model-00011-of-00031.safetensors +1 -1
- model-00012-of-00031.safetensors +1 -1
- model-00013-of-00031.safetensors +1 -1
- model-00014-of-00031.safetensors +1 -1
- model-00015-of-00031.safetensors +1 -1
- model-00016-of-00031.safetensors +1 -1
- model-00017-of-00031.safetensors +1 -1
- model-00018-of-00031.safetensors +1 -1
- model-00019-of-00031.safetensors +1 -1
- model-00020-of-00031.safetensors +1 -1
- model-00021-of-00031.safetensors +1 -1
- model-00022-of-00031.safetensors +1 -1
- model-00023-of-00031.safetensors +1 -1
- model-00024-of-00031.safetensors +1 -1
- model-00025-of-00031.safetensors +1 -1
- model-00026-of-00031.safetensors +1 -1
- model-00027-of-00031.safetensors +1 -1
- model-00028-of-00031.safetensors +1 -1
- model-00029-of-00031.safetensors +1 -1
- model-00030-of-00031.safetensors +1 -1
- model-00031-of-00031.safetensors +1 -1
- special_tokens_map.json +1 -1
- tokenizer_config.json +1 -1
- train_results.json +6 -6
- trainer_log.jsonl +0 -0
- trainer_state.json +0 -0
- training_args.bin +2 -2
- training_eval_loss.png +0 -0
- training_loss.png +0 -0
.gitattributes
CHANGED
@@ -34,3 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
|
34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
36 |
tokenizer.json filter=lfs diff=lfs merge=lfs -text
|
|
|
|
34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
36 |
tokenizer.json filter=lfs diff=lfs merge=lfs -text
|
37 |
+
cantonese_llm_v1.jpg filter=lfs diff=lfs merge=lfs -text
|
README.md
CHANGED
@@ -1,89 +1,51 @@
|
|
1 |
---
|
2 |
-
library_name: transformers
|
3 |
license: other
|
4 |
-
|
5 |
tags:
|
6 |
- llama-factory
|
7 |
- full
|
8 |
- generated_from_trainer
|
|
|
9 |
model-index:
|
10 |
- name: CantoneseLLMChat-v1.0-72B
|
11 |
results: []
|
12 |
---
|
13 |
|
14 |
-
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
|
15 |
-
should probably proofread and complete it, then remove this comment. -->
|
16 |
|
17 |
# CantoneseLLMChat-v1.0-72B
|
18 |
|
19 |
-
|
20 |
-
It achieves the following results on the evaluation set:
|
21 |
-
- Loss: 0.9810
|
22 |
-
|
23 |
-
## Model description
|
24 |
-
|
25 |
-
More information needed
|
26 |
-
|
27 |
-
## Intended uses & limitations
|
28 |
-
|
29 |
-
More information needed
|
30 |
-
|
31 |
-
## Training and evaluation data
|
32 |
|
33 |
-
More information needed
|
34 |
|
35 |
-
|
|
|
36 |
|
37 |
-
|
38 |
-
|
39 |
-
|
40 |
-
|
41 |
-
|
42 |
-
|
43 |
-
|
44 |
-
|
45 |
-
|
46 |
-
|
47 |
-
|
48 |
-
|
49 |
-
|
50 |
-
|
51 |
-
|
52 |
-
|
53 |
-
|
54 |
-
|
55 |
-
|
56 |
-
|
57 |
-
|
58 |
-
|
59 |
-
|
60 |
-
|
61 |
-
|
62 |
-
|
63 |
-
|
64 |
-
|
65 |
-
|
66 |
-
| 0.8446 | 0.4318 | 900 | 0.8920 |
|
67 |
-
| 0.8908 | 0.4797 | 1000 | 0.8980 |
|
68 |
-
| 0.8806 | 0.5277 | 1100 | 0.8870 |
|
69 |
-
| 0.8549 | 0.5757 | 1200 | 0.8887 |
|
70 |
-
| 0.9197 | 0.6237 | 1300 | 0.8914 |
|
71 |
-
| 0.8864 | 0.6716 | 1400 | 0.8827 |
|
72 |
-
| 0.8231 | 0.7196 | 1500 | 0.8758 |
|
73 |
-
| 0.8658 | 0.7676 | 1600 | 0.8723 |
|
74 |
-
| 0.8506 | 0.8155 | 1700 | 0.8722 |
|
75 |
-
| 0.9533 | 0.8635 | 1800 | 0.8710 |
|
76 |
-
| 0.7901 | 0.9115 | 1900 | 0.8655 |
|
77 |
-
| 0.8306 | 0.9595 | 2000 | 0.8639 |
|
78 |
-
| 0.4431 | 1.4392 | 3000 | 0.8874 |
|
79 |
-
| 0.4682 | 1.9189 | 4000 | 0.8693 |
|
80 |
-
| 0.139 | 2.3987 | 5000 | 0.9829 |
|
81 |
-
| 0.146 | 2.8784 | 6000 | 0.9811 |
|
82 |
-
|
83 |
-
|
84 |
-
### Framework versions
|
85 |
-
|
86 |
-
- Transformers 4.46.1
|
87 |
-
- Pytorch 2.4.0+cu121
|
88 |
-
- Datasets 3.1.0
|
89 |
-
- Tokenizers 0.20.3
|
|
|
1 |
---
|
|
|
2 |
license: other
|
3 |
+
library_name: transformers
|
4 |
tags:
|
5 |
- llama-factory
|
6 |
- full
|
7 |
- generated_from_trainer
|
8 |
+
base_model: hon9kon9ize/CantoneseLLM-v1.0-72B-cpt
|
9 |
model-index:
|
10 |
- name: CantoneseLLMChat-v1.0-72B
|
11 |
results: []
|
12 |
---
|
13 |
|
|
|
|
|
14 |
|
15 |
# CantoneseLLMChat-v1.0-72B
|
16 |
|
17 |
+

|
18 |
|
|
|
19 |
|
20 |
+
Cantonese LLM Chat v1.0 is the first-generation Cantonese LLM from hon9kon9ize.
|
21 |
+
Building on the success of the [v0.5 preview](https://huggingface.co/hon9kon9ize/CantoneseLLMChat-v0.5), the model excels at Hong Kong-specific knowledge and Cantonese conversation.
|
22 |
|
23 |
+
## Model description
|
24 |
+
The base model was obtained via continuous pre-training of [Qwen 2.5 72B](https://huggingface.co/Qwen/Qwen2.5-72B) on 600 million publicly available Hong Kong news articles and Cantonese websites.
|
25 |
+
The instruction fine-tuned model was trained on a dataset consisting of 75,000 instruction pairs, of which 45,000 were Cantonese instructions generated by other LLMs and reviewed by humans.
|
26 |
+
|
27 |
+
The model was trained on 16 Nvidia H100 96GB HBM2e GPUs on the [Genkai Supercomputer](https://www.cc.kyushu-u.ac.jp/scp/eng/system/Genkai/hardware/).
|
28 |
+
|
29 |
+
## Basic Usage
|
30 |
+
```
|
31 |
+
import torch
|
32 |
+
from transformers import AutoTokenizer, AutoModelForCausalLM
|
33 |
+
model_id = "hon9kon9ize/CantoneseLLMChat-v1.0-72B"
|
34 |
+
tokenizer = AutoTokenizer.from_pretrained(model_id)
|
35 |
+
model = AutoModelForCausalLM.from_pretrained(
|
36 |
+
    model_id,
|
37 |
+
    torch_dtype=torch.bfloat16,
|
38 |
+
    device_map="auto",
|
39 |
+
)
|
40 |
+
def chat(messages, temperature=0.9, max_new_tokens=200):
|
41 |
+
    input_ids = tokenizer.apply_chat_template(conversation=messages, tokenize=True, add_generation_prompt=True, return_tensors='pt').to(model.device)
|
42 |
+
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens, temperature=temperature)
|
43 |
+
    response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=False)
|
44 |
+
    return response
|
45 |
+
prompt = "邊個係香港特首?"
|
46 |
+
messages = [
|
47 |
+
    {"role": "system", "content": "you are a helpful assistant."},
|
48 |
+
    {"role": "user", "content": prompt}
|
49 |
+
]
|
50 |
+
print(chat(messages)) # 香港特別行政區行政長官係李家超。<|im_end|>
|
51 |
+
```
|
all_results.json
CHANGED
@@ -1,12 +1,12 @@
|
|
1 |
{
|
2 |
-
"epoch": 2.
|
3 |
-
"eval_loss": 0.
|
4 |
-
"eval_runtime":
|
5 |
-
"eval_samples_per_second": 9.
|
6 |
-
"eval_steps_per_second": 0.
|
7 |
-
"total_flos":
|
8 |
-
"train_loss": 0.
|
9 |
-
"train_runtime":
|
10 |
-
"train_samples_per_second":
|
11 |
-
"train_steps_per_second": 0.
|
12 |
}
|
|
|
1 |
{
|
2 |
+
"epoch": 2.9964020148716717,
|
3 |
+
"eval_loss": 0.9444097280502319,
|
4 |
+
"eval_runtime": 742.9655,
|
5 |
+
"eval_samples_per_second": 9.974,
|
6 |
+
"eval_steps_per_second": 0.625,
|
7 |
+
"total_flos": 5.073775214995702e+17,
|
8 |
+
"train_loss": 0.5172660127031643,
|
9 |
+
"train_runtime": 62603.1223,
|
10 |
+
"train_samples_per_second": 3.196,
|
11 |
+
"train_steps_per_second": 0.033
|
12 |
}
|
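The throughput figures in the updated all_results.json can be cross-checked with a little arithmetic. The sketch below copies the values from the diff above and derives approximate quantities from them; the derived names (`train_hours`, `eval_samples`, and so on) are illustrative and not fields of the file. Because the reported rates are rounded, the results are estimates only: about 17.4 hours of training, about 7,410 evaluation samples, and about 66,800 training samples per epoch. Together these are roughly in line with the 75,000 instruction pairs mentioned in the README, though the actual train/eval split is not stated in this commit.

```python
# Values copied from the updated all_results.json in this commit.
results = {
    "epoch": 2.9964020148716717,
    "eval_runtime": 742.9655,            # seconds
    "eval_samples_per_second": 9.974,
    "eval_steps_per_second": 0.625,
    "train_runtime": 62603.1223,         # seconds
    "train_samples_per_second": 3.196,
    "train_steps_per_second": 0.033,
}

# Wall-clock training time in hours.
train_hours = results["train_runtime"] / 3600

# Samples processed over the whole run, and per epoch (approximate,
# because the per-second rates are rounded in the JSON).
train_samples_seen = results["train_runtime"] * results["train_samples_per_second"]
samples_per_epoch = train_samples_seen / results["epoch"]

# Size of the evaluation set implied by runtime * throughput.
eval_samples = results["eval_runtime"] * results["eval_samples_per_second"]

# Samples per evaluation step, i.e. the effective eval batch size.
eval_batch = results["eval_samples_per_second"] / results["eval_steps_per_second"]

print(f"training time      : {train_hours:.1f} h")
print(f"samples per epoch  : {samples_per_epoch:,.0f}")
print(f"eval samples       : {eval_samples:,.0f}")
print(f"eval batch (approx): {eval_batch:.1f}")
```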
cantonese_llm_v1.jpg
ADDED
|
config.json
CHANGED
@@ -1,5 +1,5 @@
|
|
1 |
{
|
2 |
-
"_name_or_path": "
|
3 |
"architectures": [
|
4 |
"Qwen2ForCausalLM"
|
5 |
],
|
|
|
1 |
{
|
2 |
+
"_name_or_path": "/home/pj24001684/ku40000295/jc/models/Qwen72B-cpt",
|
3 |
"architectures": [
|
4 |
"Qwen2ForCausalLM"
|
5 |
],
|
eval_results.json
CHANGED
@@ -1,7 +1,7 @@
|
|
1 |
{
|
2 |
-
"epoch": 2.
|
3 |
-
"eval_loss": 0.
|
4 |
-
"eval_runtime":
|
5 |
-
"eval_samples_per_second": 9.
|
6 |
-
"eval_steps_per_second": 0.
|
7 |
}
|
|
|
1 |
{
|
2 |
+
"epoch": 2.9964020148716717,
|
3 |
+
"eval_loss": 0.9444097280502319,
|
4 |
+
"eval_runtime": 742.9655,
|
5 |
+
"eval_samples_per_second": 9.974,
|
6 |
+
"eval_steps_per_second": 0.625
|
7 |
}
|
model-00001-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4548798728
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:bb8759ff915cea07e7a3c69f183c5e25af915b1f2cace2be87a965bb320484ef
|
3 |
size 4548798728
|
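Each `.safetensors` entry in this diff is a Git LFS pointer file rather than the weights themselves: three lines giving the spec version, the SHA-256 object id, and the size in bytes. As a minimal sketch, the pointer for the first shard can be parsed and used to check a downloaded file. The helper names `parse_lfs_pointer` and `file_matches_pointer` are hypothetical and not part of any library.

```python
import hashlib

# Pointer text for model-00001-of-00031.safetensors, copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:bb8759ff915cea07e7a3c69f183c5e25af915b1f2cace2be87a965bb320484ef
size 4548798728
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def file_matches_pointer(path, pointer, chunk_size=1 << 20):
    """Hash a local file in chunks and compare size and digest with the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Since only the `oid` line changes for each shard while the `size` line stays the same, the re-uploaded weights have the same byte layout but different contents.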
model-00002-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4964101384
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:79c2826c011861b78059a360d7ec39a4a42d12ea86f0251f5dc39e8e70638d02
|
3 |
size 4964101384
|
model-00003-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781637328
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:c6d5f9b87f61e06053265562da0d4750a983ea1112fc3f697236b1f1a3abedc2
|
3 |
size 4781637328
|
model-00004-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781670320
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:425587fd272578780ac7a4105bb1a15bd18d2821269bc5ad8ebc79f0fe3714d5
|
3 |
size 4781670320
|
model-00005-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781670360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:d048ab9320bacaac332209e85bcf63a7ba46c0b0c1ac6e8de8d80e0083dd111a
|
3 |
size 4781670360
|
model-00006-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4964101416
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:7f0fcaa1be402eb8801c35da7fb371b8b0a2cfaa5928045bced5863b8ded20e8
|
3 |
size 4964101416
|
model-00007-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781637360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:d9094ab00a315f7e1f5108034351ff89433a59cc118929b754c7d81620ea37ae
|
3 |
size 4781637360
|
model-00008-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781670360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:7cc520428b6f804914aa9c7e79ed1cd8dd5fd3019e1496e5c054da7e26c21e59
|
3 |
size 4781670360
|
model-00009-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781670360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:a599f1cd0af31de43fa81b709d65258c215efbd60e96d064af161b0d8fab53ed
|
3 |
size 4781670360
|
model-00010-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4964101416
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:606c6b4eabb5abb3564f0cb07f829cef93f8326c7287d11ca18ccfaa6e825142
|
3 |
size 4964101416
|
model-00011-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781637360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:9d6b7e39d87b02214e37219779f3763de48a61b6b659226fe664817dcf29054e
|
3 |
size 4781637360
|
model-00012-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781670360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:c7432b14dbac12bce118ee55feeb3481c877055ad201a6e97f43881cc76cc78e
|
3 |
size 4781670360
|
model-00013-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781670360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:1841a6cf419ff5ceeb289e2195ab0e66c6e7c2b43539039ccd9851c956486beb
|
3 |
size 4781670360
|
model-00014-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4964101416
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:915c5be1782cb0d4673a0971d7e83b3f47d0a94b7063363c7aae5fdc37cfbce2
|
3 |
size 4964101416
|
model-00015-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781637360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:de2d9e8929c7347e73017be77865838f0790853427cfa42d7ee4d09dc7746fb6
|
3 |
size 4781637360
|
model-00016-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781670360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:50d9e2d3ea53a340c1eacf42b97eb447c69b7f6dcc83a972bb6479fa75277c79
|
3 |
size 4781670360
|
model-00017-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781670360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:afdb3e03f0c73ca1919f442cf7d783da110d4db304fdb60ebe55fdfd8e9b0a5e
|
3 |
size 4781670360
|
model-00018-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4964101416
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:187c8e1ec81c3ee9d60f64818e83935e7a721170504447ad94ed74b6c79736d1
|
3 |
size 4964101416
|
model-00019-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781637360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:05a1ccb41e0ab9c98507e1c1298dca4f906d11ccf7d210fbe49ceaa18c1b3f7c
|
3 |
size 4781637360
|
model-00020-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781670360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:e15ebf66f217cc4759d306fd2bc2f0d519ddb1ca74499b4ed2eda26fc85fba87
|
3 |
size 4781670360
|
model-00021-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781670360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:7782af60585fc270846fb17a801aac4ae34d8eb452748db67e17fdf1c43a6f53
|
3 |
size 4781670360
|
model-00022-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4964101416
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:a3049cb31bfca83642bffd42722ff72cc049a9ca49bd52e51b3c6e5a891d57c9
|
3 |
size 4964101416
|
model-00023-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781637360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:746a39ded267625e203a3885f8aa22450517acfd5115ed4f76c4f3f449b61177
|
3 |
size 4781637360
|
model-00024-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781670360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:cc226df3b9f41345a076b84c2efb0dd70a94e1eab62d7c239d1cbe29cae9949d
|
3 |
size 4781670360
|
model-00025-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781670360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:a4b37553eb10075cb26c12deaf7e7c2a8e078fd5fbf8b1459aef948df4f2997b
|
3 |
size 4781670360
|
model-00026-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4964101416
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:1d77373b7e74461abdf6940267341b8c5520e425ffc08e73e3c66120caf054d1
|
3 |
size 4964101416
|
model-00027-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781637360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:f2bdf795798cf1aab4c3e5ddac541dae5761f6510ccdebe79237b24e5f69d441
|
3 |
size 4781637360
|
model-00028-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781670360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:2db8c732238a30a17a28613fe73b80976061bd5db90755a3ff28b6e33cb3248e
|
3 |
size 4781670360
|
model-00029-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4781670360
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:663249faff6efe37e051a9aaccc71261b2da33cbb24ef206521ff5b4e0c58374
|
3 |
size 4781670360
|
model-00030-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 3208747032
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:656bb4379b28844acf96b3f66f35c0eb05bc260e9bf5f7806ef5f86153ec6b24
|
3 |
size 3208747032
|
model-00031-of-00031.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 2491416704
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:1c46c7193ab62a216807498d445f89b75dbbfeb91c68cd69809b0d030ab82830
|
3 |
size 2491416704
|
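The `size` lines of the 31 shard pointers give the total checkpoint size. The sketch below sums the sizes exactly as listed in the diffs above and, assuming the weights are stored in bfloat16 (2 bytes per parameter, as suggested by `torch_dtype=torch.bfloat16` in the README example), estimates the parameter count. The safetensors headers add a small overhead, so the estimate is slightly high.

```python
# Shard sizes in bytes, in order, copied from the LFS pointers in this commit.
shard_sizes = [
    4548798728, 4964101384, 4781637328, 4781670320, 4781670360,
    4964101416, 4781637360, 4781670360, 4781670360, 4964101416,
    4781637360, 4781670360, 4781670360, 4964101416, 4781637360,
    4781670360, 4781670360, 4964101416, 4781637360, 4781670360,
    4781670360, 4964101416, 4781637360, 4781670360, 4781670360,
    4964101416, 4781637360, 4781670360, 4781670360, 3208747032,
    2491416704,
]

total_bytes = sum(shard_sizes)

# Assumption: 2 bytes per parameter (bfloat16), ignoring safetensors metadata.
BYTES_PER_PARAM = 2
approx_params = total_bytes / BYTES_PER_PARAM

print(f"shards        : {len(shard_sizes)}")
print(f"total size    : {total_bytes / 1e9:.1f} GB ({total_bytes / 2**30:.1f} GiB)")
print(f"approx. params: {approx_params / 1e9:.1f} B")
```

The total is about 145 GB, so at this precision the weights alone need more than one 96GB GPU, which is why the README example uses `device_map="auto"` to spread the layers across the available devices.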
special_tokens_map.json
CHANGED
@@ -15,7 +15,7 @@
|
|
15 |
"<|video_pad|>"
|
16 |
],
|
17 |
"eos_token": {
|
18 |
-
"content": "<|
|
19 |
"lstrip": false,
|
20 |
"normalized": false,
|
21 |
"rstrip": false,
|
|
|
15 |
"<|video_pad|>"
|
16 |
],
|
17 |
"eos_token": {
|
18 |
+
"content": "<|endoftext|>",
|
19 |
"lstrip": false,
|
20 |
"normalized": false,
|
21 |
"rstrip": false,
|
tokenizer_config.json
CHANGED
@@ -197,7 +197,7 @@
|
|
197 |
"bos_token": null,
|
198 |
"chat_template": "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0]['role'] == 'system' %}\n {{- messages[0]['content'] }}\n {%- else %}\n {{- 'You are CantoneseLLM, created by hon9kon9ize. You are a helpful assistant.' }}\n {%- endif %}\n {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0]['role'] == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n {%- else %}\n {{- '<|im_start|>system\\nYou are CantoneseLLM, created by hon9kon9ize. 
You are a helpful assistant.<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {{- '<|im_start|>' + message.role }}\n {%- if message.content %}\n {{- '\\n' + message.content }}\n {%- endif %}\n {%- for tool_call in message.tool_calls %}\n {%- if tool_call.function is defined %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '\\n<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {{- tool_call.arguments | tojson }}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- message.content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
|
199 |
"clean_up_tokenization_spaces": false,
|
200 |
-
"eos_token": "<|
|
201 |
"errors": "replace",
|
202 |
"model_max_length": 131072,
|
203 |
"pad_token": "<|endoftext|>",
|
|
|
197 |
"bos_token": null,
|
198 |
"chat_template": "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0]['role'] == 'system' %}\n {{- messages[0]['content'] }}\n {%- else %}\n {{- 'You are CantoneseLLM, created by hon9kon9ize. You are a helpful assistant.' }}\n {%- endif %}\n {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0]['role'] == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n {%- else %}\n {{- '<|im_start|>system\\nYou are CantoneseLLM, created by hon9kon9ize. 
You are a helpful assistant.<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {{- '<|im_start|>' + message.role }}\n {%- if message.content %}\n {{- '\\n' + message.content }}\n {%- endif %}\n {%- for tool_call in message.tool_calls %}\n {%- if tool_call.function is defined %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '\\n<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {{- tool_call.arguments | tojson }}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- message.content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
|
199 |
"clean_up_tokenization_spaces": false,
|
200 |
+
"eos_token": "<|endoftext|>",
|
201 |
"errors": "replace",
|
202 |
"model_max_length": 131072,
|
203 |
"pad_token": "<|endoftext|>",
|
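The `chat_template` above is a ChatML-style Jinja template. As a rough illustration of what it produces when no tools are passed, the following pure-Python sketch mirrors only the non-tool branch: an optional leading system message (or the default one), each remaining message wrapped in `<|im_start|>` / `<|im_end|>`, and an optional assistant generation prompt. The function `render_chatml` is hypothetical and is not the tokenizer's implementation. The default system prompt is assumed to be a single line, since the line break in the rendered diff looks like a wrapping artifact.

```python
# Assumption: the default system prompt is one line with single spaces.
DEFAULT_SYSTEM = (
    "You are CantoneseLLM, created by hon9kon9ize. You are a helpful assistant."
)


def render_chatml(messages, add_generation_prompt=True):
    """Approximate the template's output for plain chat messages (no tools)."""
    parts = []
    if messages and messages[0]["role"] == "system":
        # A leading system message replaces the default system prompt.
        parts.append(f"<|im_start|>system\n{messages[0]['content']}<|im_end|>\n")
        rest = messages[1:]
    else:
        parts.append(f"<|im_start|>system\n{DEFAULT_SYSTEM}<|im_end|>\n")
        rest = messages
    for message in rest:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt_text = render_chatml([{"role": "user", "content": "邊個係香港特首?"}])
print(prompt_text)
```

Each turn ends with `<|im_end|>`, while this commit sets `eos_token` to `<|endoftext|>`. This matches the README example, where the decoded response still shows a trailing `<|im_end|>` because `skip_special_tokens=False` is used.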
train_results.json
CHANGED
@@ -1,8 +1,8 @@
|
|
1 |
{
|
2 |
-
"epoch": 2.
|
3 |
-
"total_flos":
|
4 |
-
"train_loss": 0.
|
5 |
-
"train_runtime":
|
6 |
-
"train_samples_per_second":
|
7 |
-
"train_steps_per_second": 0.
|
8 |
}
|
|
|
1 |
{
|
2 |
+
"epoch": 2.9964020148716717,
|
3 |
+
"total_flos": 5.073775214995702e+17,
|
4 |
+
"train_loss": 0.5172660127031643,
|
5 |
+
"train_runtime": 62603.1223,
|
6 |
+
"train_samples_per_second": 3.196,
|
7 |
+
"train_steps_per_second": 0.033
|
8 |
}
|
trainer_log.jsonl
CHANGED
The diff for this file is too large to render.
See raw diff
|
|
trainer_state.json
CHANGED
The diff for this file is too large to render.
See raw diff
|
|
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:699d52cf06aa151f43cc82bf48ddba6f12cc6270440e0acbb8fbc0154ff38597
|
3 |
+
size 7288
|
training_eval_loss.png
CHANGED
training_loss.png
CHANGED