indiejoseph committed on
Commit
6abfbf6
·
verified ·
1 Parent(s): cb6c7f7

Upload folder using huggingface_hub

Files changed (45)
  1. .gitattributes +1 -0
  2. README.md +34 -72
  3. all_results.json +10 -10
  4. cantonese_llm_v1.jpg +3 -0
  5. config.json +1 -1
  6. eval_results.json +5 -5
  7. model-00001-of-00031.safetensors +1 -1
  8. model-00002-of-00031.safetensors +1 -1
  9. model-00003-of-00031.safetensors +1 -1
  10. model-00004-of-00031.safetensors +1 -1
  11. model-00005-of-00031.safetensors +1 -1
  12. model-00006-of-00031.safetensors +1 -1
  13. model-00007-of-00031.safetensors +1 -1
  14. model-00008-of-00031.safetensors +1 -1
  15. model-00009-of-00031.safetensors +1 -1
  16. model-00010-of-00031.safetensors +1 -1
  17. model-00011-of-00031.safetensors +1 -1
  18. model-00012-of-00031.safetensors +1 -1
  19. model-00013-of-00031.safetensors +1 -1
  20. model-00014-of-00031.safetensors +1 -1
  21. model-00015-of-00031.safetensors +1 -1
  22. model-00016-of-00031.safetensors +1 -1
  23. model-00017-of-00031.safetensors +1 -1
  24. model-00018-of-00031.safetensors +1 -1
  25. model-00019-of-00031.safetensors +1 -1
  26. model-00020-of-00031.safetensors +1 -1
  27. model-00021-of-00031.safetensors +1 -1
  28. model-00022-of-00031.safetensors +1 -1
  29. model-00023-of-00031.safetensors +1 -1
  30. model-00024-of-00031.safetensors +1 -1
  31. model-00025-of-00031.safetensors +1 -1
  32. model-00026-of-00031.safetensors +1 -1
  33. model-00027-of-00031.safetensors +1 -1
  34. model-00028-of-00031.safetensors +1 -1
  35. model-00029-of-00031.safetensors +1 -1
  36. model-00030-of-00031.safetensors +1 -1
  37. model-00031-of-00031.safetensors +1 -1
  38. special_tokens_map.json +1 -1
  39. tokenizer_config.json +1 -1
  40. train_results.json +6 -6
  41. trainer_log.jsonl +0 -0
  42. trainer_state.json +0 -0
  43. training_args.bin +2 -2
  44. training_eval_loss.png +0 -0
  45. training_loss.png +0 -0
.gitattributes CHANGED
@@ -34,3 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
  tokenizer.json filter=lfs diff=lfs merge=lfs -text
 
 
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
  tokenizer.json filter=lfs diff=lfs merge=lfs -text
37
+ cantonese_llm_v1.jpg filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,89 +1,51 @@
1
  ---
2
- library_name: transformers
3
  license: other
4
- base_model: hon9kon9ize/CantoneseLLM-v1.0-72B
5
  tags:
6
  - llama-factory
7
  - full
8
  - generated_from_trainer
 
9
  model-index:
10
  - name: CantoneseLLMChat-v1.0-72B
11
  results: []
12
  ---
13
 
14
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
15
- should probably proofread and complete it, then remove this comment. -->
16
 
17
  # CantoneseLLMChat-v1.0-72B
18
 
19
- This model is a fine-tuned version of [hon9kon9ize/CantoneseLLM-v1.0-72B](https://huggingface.co/hon9kon9ize/CantoneseLLM-v1.0-72B) on the sft_v1 dataset.
20
- It achieves the following results on the evaluation set:
21
- - Loss: 0.9810
22
-
23
- ## Model description
24
-
25
- More information needed
26
-
27
- ## Intended uses & limitations
28
-
29
- More information needed
30
-
31
- ## Training and evaluation data
32
 
33
- More information needed
34
 
35
- ## Training procedure
 
36
 
37
- ### Training hyperparameters
38
-
39
- The following hyperparameters were used during training:
40
- - learning_rate: 1e-05
41
- - train_batch_size: 1
42
- - eval_batch_size: 1
43
- - seed: 42
44
- - distributed_type: multi-GPU
45
- - num_devices: 16
46
- - gradient_accumulation_steps: 2
47
- - total_train_batch_size: 32
48
- - total_eval_batch_size: 16
49
- - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
50
- - lr_scheduler_type: cosine
51
- - lr_scheduler_warmup_ratio: 0.1
52
- - num_epochs: 3.0
53
-
54
- ### Training results
55
-
56
- | Training Loss | Epoch | Step | Validation Loss |
57
- |:-------------:|:------:|:----:|:---------------:|
58
- | 1.0194 | 0.0480 | 100 | 1.0800 |
59
- | 0.8006 | 0.0959 | 200 | 0.8958 |
60
- | 0.8183 | 0.1439 | 300 | 0.8850 |
61
- | 0.7918 | 0.1919 | 400 | 0.8789 |
62
- | 0.8661 | 0.2399 | 500 | 0.8784 |
63
- | 0.9071 | 0.2878 | 600 | 0.8993 |
64
- | 0.8955 | 0.3358 | 700 | 0.8928 |
65
- | 0.89 | 0.3838 | 800 | 0.8971 |
66
- | 0.8446 | 0.4318 | 900 | 0.8920 |
67
- | 0.8908 | 0.4797 | 1000 | 0.8980 |
68
- | 0.8806 | 0.5277 | 1100 | 0.8870 |
69
- | 0.8549 | 0.5757 | 1200 | 0.8887 |
70
- | 0.9197 | 0.6237 | 1300 | 0.8914 |
71
- | 0.8864 | 0.6716 | 1400 | 0.8827 |
72
- | 0.8231 | 0.7196 | 1500 | 0.8758 |
73
- | 0.8658 | 0.7676 | 1600 | 0.8723 |
74
- | 0.8506 | 0.8155 | 1700 | 0.8722 |
75
- | 0.9533 | 0.8635 | 1800 | 0.8710 |
76
- | 0.7901 | 0.9115 | 1900 | 0.8655 |
77
- | 0.8306 | 0.9595 | 2000 | 0.8639 |
78
- | 0.4431 | 1.4392 | 3000 | 0.8874 |
79
- | 0.4682 | 1.9189 | 4000 | 0.8693 |
80
- | 0.139 | 2.3987 | 5000 | 0.9829 |
81
- | 0.146 | 2.8784 | 6000 | 0.9811 |
82
-
83
-
84
- ### Framework versions
85
-
86
- - Transformers 4.46.1
87
- - Pytorch 2.4.0+cu121
88
- - Datasets 3.1.0
89
- - Tokenizers 0.20.3
 
1
  ---
 
2
  license: other
3
+ library_name: transformers
4
  tags:
5
  - llama-factory
6
  - full
7
  - generated_from_trainer
8
+ base_model: hon9kon9ize/CantoneseLLM-v1.0-72B-cpt
9
  model-index:
10
  - name: CantoneseLLMChat-v1.0-72B
11
  results: []
12
  ---
13
 
 
 
14
 
15
  # CantoneseLLMChat-v1.0-72B
16
 
17
+ ![front_image](cantonese_llm_v1.jpg)
18
 
 
19
 
20
+ Cantonese LLM Chat v1.0 is the first-generation Cantonese LLM from hon9kon9ize.
21
+ Building upon the success of the [v0.5 preview](https://huggingface.co/hon9kon9ize/CantoneseLLMChat-v0.5), the model excels in Hong Kong-specific knowledge and Cantonese conversation.
22
 
23
+ ## Model description
24
+ The base model was obtained via continuous pre-training of [Qwen 2.5 72B](https://huggingface.co/Qwen/Qwen2.5-72B) on 600 million publicly available Hong Kong news articles and Cantonese websites.
25
+ The instruction fine-tuned model was trained on a dataset consisting of 75,000 instruction pairs, of which 45,000 were Cantonese instructions generated by other LLMs and reviewed by humans.
26
+
27
+ The model was trained on 16 Nvidia H100 96GB HBM2e GPUs on the [Genkai Supercomputer](https://www.cc.kyushu-u.ac.jp/scp/eng/system/Genkai/hardware/).
28
+
29
+ ## Basic Usage
30
+ ```
31
+ import torch
32
+ from transformers import AutoTokenizer, AutoModelForCausalLM
33
+ model_id = "hon9kon9ize/CantoneseLLMChat-v1.0-72B"
34
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
35
+ model = AutoModelForCausalLM.from_pretrained(
36
+     model_id,
37
+     torch_dtype=torch.bfloat16,
38
+     device_map="auto",
39
+ )
40
+ def chat(messages, temperature=0.9, max_new_tokens=200):
41
+     input_ids = tokenizer.apply_chat_template(conversation=messages, tokenize=True, add_generation_prompt=True, return_tensors='pt').to('cuda:0')
42
+     output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens, temperature=temperature)
43
+     response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=False)
44
+     return response
45
+ prompt = "邊個係香港特首?"
46
+ messages = [
47
+     {"role": "system", "content": "you are a helpful assistant."},
48
+     {"role": "user", "content": prompt}
49
+ ]
50
+ print(chat(messages)) # 香港特別行政區行政長官係李家超。<|im_end|>
51
+ ```
all_results.json CHANGED
@@ -1,12 +1,12 @@
1
  {
2
- "epoch": 2.9992804029743345,
3
- "eval_loss": 0.9810025691986084,
4
- "eval_runtime": 744.7507,
5
- "eval_samples_per_second": 9.95,
6
- "eval_steps_per_second": 0.623,
7
- "total_flos": 3.055078882714583e+17,
8
- "train_loss": 0.20201521048409316,
9
- "train_runtime": 48283.8245,
10
- "train_samples_per_second": 4.144,
11
- "train_steps_per_second": 0.129
12
  }
 
1
  {
2
+ "epoch": 2.9964020148716717,
3
+ "eval_loss": 0.9444097280502319,
4
+ "eval_runtime": 742.9655,
5
+ "eval_samples_per_second": 9.974,
6
+ "eval_steps_per_second": 0.625,
7
+ "total_flos": 5.073775214995702e+17,
8
+ "train_loss": 0.5172660127031643,
9
+ "train_runtime": 62603.1223,
10
+ "train_samples_per_second": 3.196,
11
+ "train_steps_per_second": 0.033
12
  }
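Reviewer note: the per-second rates in the new `all_results.json` can be turned into rough totals as a sanity check. The sketch below is not part of the commit; the `summarize` helper and its key names are made up for illustration, and because the logged rates are rounded, every derived figure is approximate. Under those assumptions the new run comes out at about 17.4 hours, roughly 66.8k training samples per epoch, and roughly 7.4k evaluation samples. That would be in line with the 75,000 instruction pairs mentioned in the README if the evaluation set was held out from the same data, which the diff does not confirm.

```python
# Values copied from the new side of all_results.json in this commit.
results = {
    "epoch": 2.9964020148716717,
    "eval_runtime": 742.9655,
    "eval_samples_per_second": 9.974,
    "train_runtime": 62603.1223,
    "train_samples_per_second": 3.196,
    "train_steps_per_second": 0.033,
}


def summarize(r):
    """Derive rough totals from the rounded per-second rates (hypothetical helper)."""
    train_samples = r["train_runtime"] * r["train_samples_per_second"]
    return {
        "train_hours": r["train_runtime"] / 3600,
        "approx_optimizer_steps": r["train_runtime"] * r["train_steps_per_second"],
        "approx_train_samples_per_epoch": train_samples / r["epoch"],
        "approx_eval_samples": r["eval_runtime"] * r["eval_samples_per_second"],
    }


summary = summarize(results)
for name, value in summary.items():
    print(f"{name}: {value:,.1f}")
```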
cantonese_llm_v1.jpg ADDED

Git LFS Details

  • SHA256: 3e16cb6d7cefdfe983cbc30e04e44ba3da33d51ba0ab8b575c47bf7e9b113b92
  • Pointer size: 131 Bytes
  • Size of remote file: 523 kB
config.json CHANGED
@@ -1,5 +1,5 @@
1
  {
2
- "_name_or_path": "hon9kon9ize/CantoneseLLM-v1.0-72B",
3
  "architectures": [
4
  "Qwen2ForCausalLM"
5
  ],
 
1
  {
2
+ "_name_or_path": "/home/pj24001684/ku40000295/jc/models/Qwen72B-cpt",
3
  "architectures": [
4
  "Qwen2ForCausalLM"
5
  ],
eval_results.json CHANGED
@@ -1,7 +1,7 @@
1
  {
2
- "epoch": 2.9992804029743345,
3
- "eval_loss": 0.9810025691986084,
4
- "eval_runtime": 744.7507,
5
- "eval_samples_per_second": 9.95,
6
- "eval_steps_per_second": 0.623
7
  }
 
1
  {
2
+ "epoch": 2.9964020148716717,
3
+ "eval_loss": 0.9444097280502319,
4
+ "eval_runtime": 742.9655,
5
+ "eval_samples_per_second": 9.974,
6
+ "eval_steps_per_second": 0.625
7
  }
model-00001-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:43b7ea5c62d1fa15d60dacef63ac29e519920b6d266af56d2d5ec1c0325767c1
3
  size 4548798728
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:bb8759ff915cea07e7a3c69f183c5e25af915b1f2cace2be87a965bb320484ef
3
  size 4548798728
model-00002-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:37469788e026a971afe4d061435e4c77437567306852b177b35e9e6010914851
3
  size 4964101384
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:79c2826c011861b78059a360d7ec39a4a42d12ea86f0251f5dc39e8e70638d02
3
  size 4964101384
model-00003-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:0f44fc363178a09b3f798c6af947a7e2150796f9f321f43fdce21acac6c92408
3
  size 4781637328
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c6d5f9b87f61e06053265562da0d4750a983ea1112fc3f697236b1f1a3abedc2
3
  size 4781637328
model-00004-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:2c4bfb24f814c13c473d08d9f7768f8afc648d3d723ce282770a33141657a29e
3
  size 4781670320
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:425587fd272578780ac7a4105bb1a15bd18d2821269bc5ad8ebc79f0fe3714d5
3
  size 4781670320
model-00005-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:69bde8ff91dad1de422da6076257a81cff31d1fbf2740e2430074919cfd7c4cf
3
  size 4781670360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d048ab9320bacaac332209e85bcf63a7ba46c0b0c1ac6e8de8d80e0083dd111a
3
  size 4781670360
model-00006-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:22290b1b6327b5510c774cb5ffe102f9770169d3bd4f057b8d84ad32aa4fb039
3
  size 4964101416
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7f0fcaa1be402eb8801c35da7fb371b8b0a2cfaa5928045bced5863b8ded20e8
3
  size 4964101416
model-00007-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:025cf7656f41047c5ce6b0b83498d04e3018ec5b30514b8f54a2547cd21df9fe
3
  size 4781637360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d9094ab00a315f7e1f5108034351ff89433a59cc118929b754c7d81620ea37ae
3
  size 4781637360
model-00008-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:76f7117e8b5fd2b3ef392a75af08dcdac526645a67efc22ae7a688fef54f8cd3
3
  size 4781670360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7cc520428b6f804914aa9c7e79ed1cd8dd5fd3019e1496e5c054da7e26c21e59
3
  size 4781670360
model-00009-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:605aa2be9123c1b363d9f1e94550c66845ab146f05e640639ad448114f16aedd
3
  size 4781670360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a599f1cd0af31de43fa81b709d65258c215efbd60e96d064af161b0d8fab53ed
3
  size 4781670360
model-00010-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:086f5e40f8f8482499dbac441a4731383ec481334893848a49caa1dd213978c3
3
  size 4964101416
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:606c6b4eabb5abb3564f0cb07f829cef93f8326c7287d11ca18ccfaa6e825142
3
  size 4964101416
model-00011-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:d05174227980bebc7e5e4a798aab7b750b260fe147ad87a479680ad8f0b541d4
3
  size 4781637360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9d6b7e39d87b02214e37219779f3763de48a61b6b659226fe664817dcf29054e
3
  size 4781637360
model-00012-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:dfc091fe38b032a4aacfd98163c166fdae5f203addbf9be4f419c41c8f58a593
3
  size 4781670360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c7432b14dbac12bce118ee55feeb3481c877055ad201a6e97f43881cc76cc78e
3
  size 4781670360
model-00013-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:cf039aaa028a42bb8ae9901469b5166b476df0b0f1efa5c857a912687322fafe
3
  size 4781670360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1841a6cf419ff5ceeb289e2195ab0e66c6e7c2b43539039ccd9851c956486beb
3
  size 4781670360
model-00014-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:5419ef25fc50600a82b0072b4314244ab75f9ed4a858b3386f08f6c4c4375c40
3
  size 4964101416
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:915c5be1782cb0d4673a0971d7e83b3f47d0a94b7063363c7aae5fdc37cfbce2
3
  size 4964101416
model-00015-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:15d810f81d7a280b43a0fe646d8b5363990f822866fe279f51a74bd042cbaa0e
3
  size 4781637360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:de2d9e8929c7347e73017be77865838f0790853427cfa42d7ee4d09dc7746fb6
3
  size 4781637360
model-00016-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:e1382ef8accb5bb64aa345a62661b5f4b62bbfc92eb02cb178ea9ec81962f9e2
3
  size 4781670360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:50d9e2d3ea53a340c1eacf42b97eb447c69b7f6dcc83a972bb6479fa75277c79
3
  size 4781670360
model-00017-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:061a0a61973afc255f91f822cdbe3cafb5e28952d4f821199c82eafa2582e84f
3
  size 4781670360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:afdb3e03f0c73ca1919f442cf7d783da110d4db304fdb60ebe55fdfd8e9b0a5e
3
  size 4781670360
model-00018-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:19afcb6ac2b57db540b785f3d1c3cf419fb986c71f0f582e6967bb10d96d4fa4
3
  size 4964101416
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:187c8e1ec81c3ee9d60f64818e83935e7a721170504447ad94ed74b6c79736d1
3
  size 4964101416
model-00019-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:e3d3a722ad4cbcab6d3d22019758a4072dde55676227ecd6217ce99b17e3ee64
3
  size 4781637360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:05a1ccb41e0ab9c98507e1c1298dca4f906d11ccf7d210fbe49ceaa18c1b3f7c
3
  size 4781637360
model-00020-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:9889d68c46cc5819a36ef9a252edd7ff6886544bf600c9bbe75f8dd329a8396f
3
  size 4781670360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e15ebf66f217cc4759d306fd2bc2f0d519ddb1ca74499b4ed2eda26fc85fba87
3
  size 4781670360
model-00021-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:3dbab9fc367987c2025ff0d032f77272f1a5acc310db34518ce899ed8e713671
3
  size 4781670360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7782af60585fc270846fb17a801aac4ae34d8eb452748db67e17fdf1c43a6f53
3
  size 4781670360
model-00022-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:2d4395c4226c6356e8746271c7b4f702d5a44c463e6b229e8d64e73f148328d6
3
  size 4964101416
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a3049cb31bfca83642bffd42722ff72cc049a9ca49bd52e51b3c6e5a891d57c9
3
  size 4964101416
model-00023-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:a488d5b21b86bf12cae483ae0d8316c20774ff52a28e9c82119e6208d5537d2d
3
  size 4781637360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:746a39ded267625e203a3885f8aa22450517acfd5115ed4f76c4f3f449b61177
3
  size 4781637360
model-00024-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:4e20bd1c3d0fb02d2a98c2647c35428232d582364209fa5128f819e1be6d2df3
3
  size 4781670360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cc226df3b9f41345a076b84c2efb0dd70a94e1eab62d7c239d1cbe29cae9949d
3
  size 4781670360
model-00025-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:c3bdaa7d25301ac60d05d175b63aa501eee79bc0325d2bba32c16c76b7499fe1
3
  size 4781670360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a4b37553eb10075cb26c12deaf7e7c2a8e078fd5fbf8b1459aef948df4f2997b
3
  size 4781670360
model-00026-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:0d7a849cf3b3ab6d9a653cf79578e892baeeba5885c1170b5d3c78ac19d483a1
3
  size 4964101416
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1d77373b7e74461abdf6940267341b8c5520e425ffc08e73e3c66120caf054d1
3
  size 4964101416
model-00027-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:a3f1d016dcd0a2755c2b51afbce00e4fbd3fbe25e1bdfb805a5097147bf2611c
3
  size 4781637360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f2bdf795798cf1aab4c3e5ddac541dae5761f6510ccdebe79237b24e5f69d441
3
  size 4781637360
model-00028-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b50782dd5068979cc77b3809fddc9ac1196bc604c0873988710d0de60b476da9
3
  size 4781670360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2db8c732238a30a17a28613fe73b80976061bd5db90755a3ff28b6e33cb3248e
3
  size 4781670360
model-00029-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:973d0d8ae68989048f87cca364ff106ead0946e82fbafadcce5c29c9c19c4c82
3
  size 4781670360
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:663249faff6efe37e051a9aaccc71261b2da33cbb24ef206521ff5b4e0c58374
3
  size 4781670360
model-00030-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:45b4affcb9769b30af316145d42a5b02fc61cddf2ef2c486398a19305e55850c
3
  size 3208747032
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:656bb4379b28844acf96b3f66f35c0eb05bc260e9bf5f7806ef5f86153ec6b24
3
  size 3208747032
model-00031-of-00031.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:c822e56b16330f9cbe0e6ec191a6fc1f17496b9806bdbfed1d32334689bb5082
3
  size 2491416704
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1c46c7193ab62a216807498d445f89b75dbbfeb91c68cd69809b0d030ab82830
3
  size 2491416704
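Reviewer note: each `.safetensors` entry above is a Git LFS pointer (a `version` line, an `oid sha256:` digest, and a `size` in bytes), so only the digests change while the shard sizes stay the same. A minimal sketch of checking a downloaded shard against its pointer follows; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, not part of `huggingface_hub` or Git LFS, and the local file path is something the caller has to supply.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the three 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.strip().split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest the pointer records."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Read in chunks so multi-gigabyte shards are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer text for the last shard, copied from the new side of the diff.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:1c46c7193ab62a216807498d445f89b75dbbfeb91c68cd69809b0d030ab82830\n"
    "size 2491416704\n"
)
print(pointer["algo"], pointer["size"])
```

Checking the size first is a cheap way to reject a truncated download before hashing the whole file.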
special_tokens_map.json CHANGED
@@ -15,7 +15,7 @@
15
  "<|video_pad|>"
16
  ],
17
  "eos_token": {
18
- "content": "<|im_end|>",
19
  "lstrip": false,
20
  "normalized": false,
21
  "rstrip": false,
 
15
  "<|video_pad|>"
16
  ],
17
  "eos_token": {
18
+ "content": "<|endoftext|>",
19
  "lstrip": false,
20
  "normalized": false,
21
  "rstrip": false,
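Reviewer note: this commit changes `eos_token` from `<|im_end|>` to `<|endoftext|>`, while the chat template still closes each turn with `<|im_end|>`. Whether generation stops at `<|im_end|>` depends on `generation_config.json`, which this diff does not show. The README example decodes with `skip_special_tokens=False`, so the marker can appear in the returned text. A minimal sketch of trimming it afterwards, using a hypothetical `strip_end_markers` helper that is not part of the repository:

```python
# Markers taken from the tokenizer files in this commit.
END_MARKERS = ("<|im_end|>", "<|endoftext|>")


def strip_end_markers(text, markers=END_MARKERS):
    """Cut the text at the first end marker found and trim surrounding whitespace."""
    cut = len(text)
    for marker in markers:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].strip()


print(strip_end_markers("香港特別行政區行政長官係李家超。<|im_end|>"))
```

Cutting at the first marker also discards anything the model emitted after its turn ended, which may or may not be what a caller wants.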
tokenizer_config.json CHANGED
@@ -197,7 +197,7 @@
197
  "bos_token": null,
198
  "chat_template": "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0]['role'] == 'system' %}\n {{- messages[0]['content'] }}\n {%- else %}\n {{- 'You are CantoneseLLM, created by hon9kon9ize. You are a helpful assistant.' }}\n {%- endif %}\n {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0]['role'] == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n {%- else %}\n {{- '<|im_start|>system\\nYou are CantoneseLLM, created by hon9kon9ize. 
You are a helpful assistant.<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {{- '<|im_start|>' + message.role }}\n {%- if message.content %}\n {{- '\\n' + message.content }}\n {%- endif %}\n {%- for tool_call in message.tool_calls %}\n {%- if tool_call.function is defined %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '\\n<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {{- tool_call.arguments | tojson }}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- message.content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
199
  "clean_up_tokenization_spaces": false,
200
- "eos_token": "<|im_end|>",
201
  "errors": "replace",
202
  "model_max_length": 131072,
203
  "pad_token": "<|endoftext|>",
 
197
  "bos_token": null,
198
  "chat_template": "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0]['role'] == 'system' %}\n {{- messages[0]['content'] }}\n {%- else %}\n {{- 'You are CantoneseLLM, created by hon9kon9ize. You are a helpful assistant.' }}\n {%- endif %}\n {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0]['role'] == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n {%- else %}\n {{- '<|im_start|>system\\nYou are CantoneseLLM, created by hon9kon9ize. 
You are a helpful assistant.<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {{- '<|im_start|>' + message.role }}\n {%- if message.content %}\n {{- '\\n' + message.content }}\n {%- endif %}\n {%- for tool_call in message.tool_calls %}\n {%- if tool_call.function is defined %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '\\n<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {{- tool_call.arguments | tojson }}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- message.content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
199
  "clean_up_tokenization_spaces": false,
200
+ "eos_token": "<|endoftext|>",
201
  "errors": "replace",
202
  "model_max_length": 131072,
203
  "pad_token": "<|endoftext|>",
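Reviewer note: the `chat_template` itself is unchanged by this commit. For readers who find the escaped Jinja hard to follow, the sketch below reproduces only its non-tool branch in plain Python. `render_chatml` is a hypothetical name, tool calls and tool responses are deliberately left out, and in real use `tokenizer.apply_chat_template` remains the authoritative renderer.

```python
# Default system prompt as it appears in the chat_template above.
DEFAULT_SYSTEM = "You are CantoneseLLM, created by hon9kon9ize. You are a helpful assistant."


def render_chatml(messages, add_generation_prompt=True):
    """Simplified rendering of plain system/user/assistant messages (no tools)."""
    if messages and messages[0]["role"] == "system":
        system, rest = messages[0]["content"], messages[1:]
    else:
        # The template falls back to its built-in system prompt.
        system, rest = DEFAULT_SYSTEM, messages
    parts = [f"<|im_start|>system\n{system}<|im_end|>\n"]
    for message in rest:
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


print(render_chatml([{"role": "user", "content": "邊個係香港特首?"}]))
```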
train_results.json CHANGED
@@ -1,8 +1,8 @@
1
  {
2
- "epoch": 2.9992804029743345,
3
- "total_flos": 3.055078882714583e+17,
4
- "train_loss": 0.20201521048409316,
5
- "train_runtime": 48283.8245,
6
- "train_samples_per_second": 4.144,
7
- "train_steps_per_second": 0.129
8
  }
 
1
  {
2
+ "epoch": 2.9964020148716717,
3
+ "total_flos": 5.073775214995702e+17,
4
+ "train_loss": 0.5172660127031643,
5
+ "train_runtime": 62603.1223,
6
+ "train_samples_per_second": 3.196,
7
+ "train_steps_per_second": 0.033
8
  }
trainer_log.jsonl CHANGED
The diff for this file is too large to render. See raw diff
 
trainer_state.json CHANGED
The diff for this file is too large to render. See raw diff
 
training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:2e8ba7b9430145fd4d8652a083b441e4ef960834e061b434c8e9eb601fd64708
3
- size 7352
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:699d52cf06aa151f43cc82bf48ddba6f12cc6270440e0acbb8fbc0154ff38597
3
+ size 7288
training_eval_loss.png CHANGED
training_loss.png CHANGED