indiejoseph committed 15 days ago

Commit

6abfbf6

verified

1 Parent(s): cb6c7f7

Upload folder using huggingface_hub

Files changed (45)

.gitattributes +1 -0
README.md +34 -72
all_results.json +10 -10
cantonese_llm_v1.jpg +3 -0
config.json +1 -1
eval_results.json +5 -5
model-00001-of-00031.safetensors +1 -1
model-00002-of-00031.safetensors +1 -1
model-00003-of-00031.safetensors +1 -1
model-00004-of-00031.safetensors +1 -1
model-00005-of-00031.safetensors +1 -1
model-00006-of-00031.safetensors +1 -1
model-00007-of-00031.safetensors +1 -1
model-00008-of-00031.safetensors +1 -1
model-00009-of-00031.safetensors +1 -1
model-00010-of-00031.safetensors +1 -1
model-00011-of-00031.safetensors +1 -1
model-00012-of-00031.safetensors +1 -1
model-00013-of-00031.safetensors +1 -1
model-00014-of-00031.safetensors +1 -1
model-00015-of-00031.safetensors +1 -1
model-00016-of-00031.safetensors +1 -1
model-00017-of-00031.safetensors +1 -1
model-00018-of-00031.safetensors +1 -1
model-00019-of-00031.safetensors +1 -1
model-00020-of-00031.safetensors +1 -1
model-00021-of-00031.safetensors +1 -1
model-00022-of-00031.safetensors +1 -1
model-00023-of-00031.safetensors +1 -1
model-00024-of-00031.safetensors +1 -1
model-00025-of-00031.safetensors +1 -1
model-00026-of-00031.safetensors +1 -1
model-00027-of-00031.safetensors +1 -1
model-00028-of-00031.safetensors +1 -1
model-00029-of-00031.safetensors +1 -1
model-00030-of-00031.safetensors +1 -1
model-00031-of-00031.safetensors +1 -1
special_tokens_map.json +1 -1
tokenizer_config.json +1 -1
train_results.json +6 -6
trainer_log.jsonl +0 -0
trainer_state.json +0 -0
training_args.bin +2 -2
training_eval_loss.png +0 -0
training_loss.png +0 -0

.gitattributes CHANGED

@@ -34,3 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text

 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
+cantonese_llm_v1.jpg filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,89 +1,51 @@
 ---
-library_name: transformers
 license: other
-base_model: hon9kon9ize/CantoneseLLM-v1.0-72B
 tags:
 - llama-factory
 - full
 - generated_from_trainer
 model-index:
 - name: CantoneseLLMChat-v1.0-72B
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 # CantoneseLLMChat-v1.0-72B
-This model is a fine-tuned version of [hon9kon9ize/CantoneseLLM-v1.0-72B](https://huggingface.co/hon9kon9ize/CantoneseLLM-v1.0-72B) on the sft_v1 dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.9810
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 1e-05
-- train_batch_size: 1
-- eval_batch_size: 1
-- seed: 42
-- distributed_type: multi-GPU
-- num_devices: 16
-- gradient_accumulation_steps: 2
-- total_train_batch_size: 32
-- total_eval_batch_size: 16
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: cosine
-- lr_scheduler_warmup_ratio: 0.1
-- num_epochs: 3.0
-### Training results
-| Training Loss | Epoch  | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| 1.0194        | 0.0480 | 100  | 1.0800          |
-| 0.8006        | 0.0959 | 200  | 0.8958          |
-| 0.8183        | 0.1439 | 300  | 0.8850          |
-| 0.7918        | 0.1919 | 400  | 0.8789          |
-| 0.8661        | 0.2399 | 500  | 0.8784          |
-| 0.9071        | 0.2878 | 600  | 0.8993          |
-| 0.8955        | 0.3358 | 700  | 0.8928          |
-| 0.89          | 0.3838 | 800  | 0.8971          |
-| 0.8446        | 0.4318 | 900  | 0.8920          |
-| 0.8908        | 0.4797 | 1000 | 0.8980          |
-| 0.8806        | 0.5277 | 1100 | 0.8870          |
-| 0.8549        | 0.5757 | 1200 | 0.8887          |
-| 0.9197        | 0.6237 | 1300 | 0.8914          |
-| 0.8864        | 0.6716 | 1400 | 0.8827          |
-| 0.8231        | 0.7196 | 1500 | 0.8758          |
-| 0.8658        | 0.7676 | 1600 | 0.8723          |
-| 0.8506        | 0.8155 | 1700 | 0.8722          |
-| 0.9533        | 0.8635 | 1800 | 0.8710          |
-| 0.7901        | 0.9115 | 1900 | 0.8655          |
-| 0.8306        | 0.9595 | 2000 | 0.8639          |
-| 0.4431        | 1.4392 | 3000 | 0.8874          |
-| 0.4682        | 1.9189 | 4000 | 0.8693          |
-| 0.139         | 2.3987 | 5000 | 0.9829          |
-| 0.146         | 2.8784 | 6000 | 0.9811          |
-### Framework versions
-- Transformers 4.46.1
-- Pytorch 2.4.0+cu121
-- Datasets 3.1.0
-- Tokenizers 0.20.3

 ---
 license: other
+library_name: transformers
 tags:
 - llama-factory
 - full
 - generated_from_trainer
+base_model: hon9kon9ize/CantoneseLLM-v1.0-72B-cpt
 model-index:
 - name: CantoneseLLMChat-v1.0-72B
   results: []
 ---
 # CantoneseLLMChat-v1.0-72B
+![front_image](cantonese_llm_v1.jpg)
+Cantonese LLM Chat v1.0 is the first-generation Cantonese LLM from hon9kon9ize.
+Building on the success of the [v0.5 preview](https://huggingface.co/hon9kon9ize/CantoneseLLMChat-v0.5), the model excels at Hong Kong-specific knowledge and Cantonese conversation.
+## Model description
+The base model was obtained via Continuous Pre-Training of [Qwen 2.5 72B](https://huggingface.co/Qwen/Qwen2.5-72B) on 600 million publicly available Hong Kong news articles and Cantonese websites.
+The instruction fine-tuned model was trained on a dataset of 75,000 instruction pairs, of which 45,000 were Cantonese instructions generated by other LLMs and reviewed by humans.
+The model was trained on 16 Nvidia H100 96GB HBM2e GPUs on the [Genkai Supercomputer](https://www.cc.kyushu-u.ac.jp/scp/eng/system/Genkai/hardware/).
+## Basic Usage
+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM
+model_id = "hon9kon9ize/CantoneseLLMChat-v1.0-72B"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype=torch.bfloat16,
+    device_map="auto",  # spread the weights across the available GPUs
+)
+def chat(messages, temperature=0.9, max_new_tokens=200):
+    # Render the conversation with the chat template and move it to the model's input device.
+    input_ids = tokenizer.apply_chat_template(conversation=messages, tokenize=True, add_generation_prompt=True, return_tensors='pt').to(model.device)
+    # do_sample=True is needed for temperature to take effect.
+    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens, do_sample=True, temperature=temperature)
+    # Decode only the newly generated tokens; special tokens such as <|im_end|> are kept.
+    response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=False)
+    return response
+prompt = "邊個係香港特首？"  # "Who is the Chief Executive of Hong Kong?"
+messages = [
+    {"role": "system", "content": "you are a helpful assistant."},
+    {"role": "user", "content": prompt}
+]
+# Example output ("The Chief Executive of the HKSAR is John Lee."):
+print(chat(messages)) # 香港特別行政區行政長官係李家超。<|im_end|>
+```
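
The `apply_chat_template` call in the README example renders messages with the `chat_template` string from `tokenizer_config.json`, shown further down in this diff. The following is a minimal sketch of what that template's no-tools branch appears to produce for plain text messages. `render_chatml` is a hypothetical helper written for illustration, not part of transformers; it does not handle tools, tool calls, or `tool` role messages, and the real Jinja template remains the source of truth.

```python
# Default system prompt, copied from the chat_template string in tokenizer_config.json.
DEFAULT_SYSTEM = "You are CantoneseLLM, created by hon9kon9ize. You are a helpful assistant."


def render_chatml(messages, add_generation_prompt=True):
    """Approximate the template's no-tools branch for user/system/assistant text messages."""
    parts = []
    if messages and messages[0]["role"] == "system":
        # A leading system message replaces the default system prompt.
        parts.append(f"<|im_start|>system\n{messages[0]['content']}<|im_end|>\n")
        rest = messages[1:]
    else:
        parts.append(f"<|im_start|>system\n{DEFAULT_SYSTEM}<|im_end|>\n")
        rest = messages
    for message in rest:
        # Every remaining message becomes one <|im_start|>role ... <|im_end|> block.
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


print(render_chatml([{"role": "user", "content": "邊個係香港特首？"}]))
```

Comparing this string with `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` is a quick way to confirm which system prompt is actually being sent.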

all_results.json CHANGED

@@ -1,12 +1,12 @@
 {
-    "epoch": 2.9992804029743345,
-    "eval_loss": 0.9810025691986084,
-    "eval_runtime": 744.7507,
-    "eval_samples_per_second": 9.95,
-    "eval_steps_per_second": 0.623,
-    "total_flos": 3.055078882714583e+17,
-    "train_loss": 0.20201521048409316,
-    "train_runtime": 48283.8245,
-    "train_samples_per_second": 4.144,
-    "train_steps_per_second": 0.129
 }

 {
+    "epoch": 2.9964020148716717,
+    "eval_loss": 0.9444097280502319,
+    "eval_runtime": 742.9655,
+    "eval_samples_per_second": 9.974,
+    "eval_steps_per_second": 0.625,
+    "total_flos": 5.073775214995702e+17,
+    "train_loss": 0.5172660127031643,
+    "train_runtime": 62603.1223,
+    "train_samples_per_second": 3.196,
+    "train_steps_per_second": 0.033
 }
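
The figures in `all_results.json` are easier to compare after a little arithmetic. The sketch below uses only values copied from the removed and added lines above, and assumes `train_runtime` and `eval_runtime` are in seconds, as the Hugging Face Trainer reports them. Variable names are illustrative.

```python
# Values copied from the removed (-) and added (+) lines of all_results.json.
old = {
    "eval_loss": 0.9810025691986084,
    "train_loss": 0.20201521048409316,
    "train_runtime": 48283.8245,
}
new = {
    "eval_loss": 0.9444097280502319,
    "train_loss": 0.5172660127031643,
    "train_runtime": 62603.1223,
    "eval_runtime": 742.9655,
    "eval_samples_per_second": 9.974,
}

# Seconds to hours for the new training run.
train_hours = new["train_runtime"] / 3600
# Runtime times throughput gives an approximate evaluation set size.
eval_samples = new["eval_runtime"] * new["eval_samples_per_second"]
# A negative delta means the new run has the lower evaluation loss.
eval_loss_delta = new["eval_loss"] - old["eval_loss"]

print(f"train time: {train_hours:.1f} h")
print(f"eval samples: ~{eval_samples:.0f}")
print(f"eval loss change: {eval_loss_delta:+.4f}")
```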

cantonese_llm_v1.jpg ADDED

Git LFS Details

SHA256: 3e16cb6d7cefdfe983cbc30e04e44ba3da33d51ba0ab8b575c47bf7e9b113b92
Pointer size: 131 Bytes
Size of remote file: 523 kB

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "hon9kon9ize/CantoneseLLM-v1.0-72B",
   "architectures": [
     "Qwen2ForCausalLM"
   ],

 {
+  "_name_or_path": "/home/pj24001684/ku40000295/jc/models/Qwen72B-cpt",
   "architectures": [
     "Qwen2ForCausalLM"
   ],

eval_results.json CHANGED

@@ -1,7 +1,7 @@
 {
-    "epoch": 2.9992804029743345,
-    "eval_loss": 0.9810025691986084,
-    "eval_runtime": 744.7507,
-    "eval_samples_per_second": 9.95,
-    "eval_steps_per_second": 0.623
 }

 {
+    "epoch": 2.9964020148716717,
+    "eval_loss": 0.9444097280502319,
+    "eval_runtime": 742.9655,
+    "eval_samples_per_second": 9.974,
+    "eval_steps_per_second": 0.625
 }

model-00001-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:43b7ea5c62d1fa15d60dacef63ac29e519920b6d266af56d2d5ec1c0325767c1
 size 4548798728

 version https://git-lfs.github.com/spec/v1
+oid sha256:bb8759ff915cea07e7a3c69f183c5e25af915b1f2cace2be87a965bb320484ef
 size 4548798728

model-00002-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:37469788e026a971afe4d061435e4c77437567306852b177b35e9e6010914851
 size 4964101384

 version https://git-lfs.github.com/spec/v1
+oid sha256:79c2826c011861b78059a360d7ec39a4a42d12ea86f0251f5dc39e8e70638d02
 size 4964101384

model-00003-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0f44fc363178a09b3f798c6af947a7e2150796f9f321f43fdce21acac6c92408
 size 4781637328

 version https://git-lfs.github.com/spec/v1
+oid sha256:c6d5f9b87f61e06053265562da0d4750a983ea1112fc3f697236b1f1a3abedc2
 size 4781637328

model-00004-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2c4bfb24f814c13c473d08d9f7768f8afc648d3d723ce282770a33141657a29e
 size 4781670320

 version https://git-lfs.github.com/spec/v1
+oid sha256:425587fd272578780ac7a4105bb1a15bd18d2821269bc5ad8ebc79f0fe3714d5
 size 4781670320

model-00005-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:69bde8ff91dad1de422da6076257a81cff31d1fbf2740e2430074919cfd7c4cf
 size 4781670360

 version https://git-lfs.github.com/spec/v1
+oid sha256:d048ab9320bacaac332209e85bcf63a7ba46c0b0c1ac6e8de8d80e0083dd111a
 size 4781670360

model-00006-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:22290b1b6327b5510c774cb5ffe102f9770169d3bd4f057b8d84ad32aa4fb039
 size 4964101416

 version https://git-lfs.github.com/spec/v1
+oid sha256:7f0fcaa1be402eb8801c35da7fb371b8b0a2cfaa5928045bced5863b8ded20e8
 size 4964101416

model-00007-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:025cf7656f41047c5ce6b0b83498d04e3018ec5b30514b8f54a2547cd21df9fe
 size 4781637360

 version https://git-lfs.github.com/spec/v1
+oid sha256:d9094ab00a315f7e1f5108034351ff89433a59cc118929b754c7d81620ea37ae
 size 4781637360

model-00008-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:76f7117e8b5fd2b3ef392a75af08dcdac526645a67efc22ae7a688fef54f8cd3
 size 4781670360

 version https://git-lfs.github.com/spec/v1
+oid sha256:7cc520428b6f804914aa9c7e79ed1cd8dd5fd3019e1496e5c054da7e26c21e59
 size 4781670360

model-00009-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:605aa2be9123c1b363d9f1e94550c66845ab146f05e640639ad448114f16aedd
 size 4781670360

 version https://git-lfs.github.com/spec/v1
+oid sha256:a599f1cd0af31de43fa81b709d65258c215efbd60e96d064af161b0d8fab53ed
 size 4781670360

model-00010-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:086f5e40f8f8482499dbac441a4731383ec481334893848a49caa1dd213978c3
 size 4964101416

 version https://git-lfs.github.com/spec/v1
+oid sha256:606c6b4eabb5abb3564f0cb07f829cef93f8326c7287d11ca18ccfaa6e825142
 size 4964101416

model-00011-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d05174227980bebc7e5e4a798aab7b750b260fe147ad87a479680ad8f0b541d4
 size 4781637360

 version https://git-lfs.github.com/spec/v1
+oid sha256:9d6b7e39d87b02214e37219779f3763de48a61b6b659226fe664817dcf29054e
 size 4781637360

model-00012-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:dfc091fe38b032a4aacfd98163c166fdae5f203addbf9be4f419c41c8f58a593
 size 4781670360

 version https://git-lfs.github.com/spec/v1
+oid sha256:c7432b14dbac12bce118ee55feeb3481c877055ad201a6e97f43881cc76cc78e
 size 4781670360

model-00013-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cf039aaa028a42bb8ae9901469b5166b476df0b0f1efa5c857a912687322fafe
 size 4781670360

 version https://git-lfs.github.com/spec/v1
+oid sha256:1841a6cf419ff5ceeb289e2195ab0e66c6e7c2b43539039ccd9851c956486beb
 size 4781670360

model-00014-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5419ef25fc50600a82b0072b4314244ab75f9ed4a858b3386f08f6c4c4375c40
 size 4964101416

 version https://git-lfs.github.com/spec/v1
+oid sha256:915c5be1782cb0d4673a0971d7e83b3f47d0a94b7063363c7aae5fdc37cfbce2
 size 4964101416

model-00015-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:15d810f81d7a280b43a0fe646d8b5363990f822866fe279f51a74bd042cbaa0e
 size 4781637360

 version https://git-lfs.github.com/spec/v1
+oid sha256:de2d9e8929c7347e73017be77865838f0790853427cfa42d7ee4d09dc7746fb6
 size 4781637360

model-00016-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e1382ef8accb5bb64aa345a62661b5f4b62bbfc92eb02cb178ea9ec81962f9e2
 size 4781670360

 version https://git-lfs.github.com/spec/v1
+oid sha256:50d9e2d3ea53a340c1eacf42b97eb447c69b7f6dcc83a972bb6479fa75277c79
 size 4781670360

model-00017-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:061a0a61973afc255f91f822cdbe3cafb5e28952d4f821199c82eafa2582e84f
 size 4781670360

 version https://git-lfs.github.com/spec/v1
+oid sha256:afdb3e03f0c73ca1919f442cf7d783da110d4db304fdb60ebe55fdfd8e9b0a5e
 size 4781670360

model-00018-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:19afcb6ac2b57db540b785f3d1c3cf419fb986c71f0f582e6967bb10d96d4fa4
 size 4964101416

 version https://git-lfs.github.com/spec/v1
+oid sha256:187c8e1ec81c3ee9d60f64818e83935e7a721170504447ad94ed74b6c79736d1
 size 4964101416

model-00019-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e3d3a722ad4cbcab6d3d22019758a4072dde55676227ecd6217ce99b17e3ee64
 size 4781637360

 version https://git-lfs.github.com/spec/v1
+oid sha256:05a1ccb41e0ab9c98507e1c1298dca4f906d11ccf7d210fbe49ceaa18c1b3f7c
 size 4781637360

model-00020-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9889d68c46cc5819a36ef9a252edd7ff6886544bf600c9bbe75f8dd329a8396f
 size 4781670360

 version https://git-lfs.github.com/spec/v1
+oid sha256:e15ebf66f217cc4759d306fd2bc2f0d519ddb1ca74499b4ed2eda26fc85fba87
 size 4781670360

model-00021-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3dbab9fc367987c2025ff0d032f77272f1a5acc310db34518ce899ed8e713671
 size 4781670360

 version https://git-lfs.github.com/spec/v1
+oid sha256:7782af60585fc270846fb17a801aac4ae34d8eb452748db67e17fdf1c43a6f53
 size 4781670360

model-00022-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2d4395c4226c6356e8746271c7b4f702d5a44c463e6b229e8d64e73f148328d6
 size 4964101416

 version https://git-lfs.github.com/spec/v1
+oid sha256:a3049cb31bfca83642bffd42722ff72cc049a9ca49bd52e51b3c6e5a891d57c9
 size 4964101416

model-00023-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a488d5b21b86bf12cae483ae0d8316c20774ff52a28e9c82119e6208d5537d2d
 size 4781637360

 version https://git-lfs.github.com/spec/v1
+oid sha256:746a39ded267625e203a3885f8aa22450517acfd5115ed4f76c4f3f449b61177
 size 4781637360

model-00024-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4e20bd1c3d0fb02d2a98c2647c35428232d582364209fa5128f819e1be6d2df3
 size 4781670360

 version https://git-lfs.github.com/spec/v1
+oid sha256:cc226df3b9f41345a076b84c2efb0dd70a94e1eab62d7c239d1cbe29cae9949d
 size 4781670360

model-00025-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c3bdaa7d25301ac60d05d175b63aa501eee79bc0325d2bba32c16c76b7499fe1
 size 4781670360

 version https://git-lfs.github.com/spec/v1
+oid sha256:a4b37553eb10075cb26c12deaf7e7c2a8e078fd5fbf8b1459aef948df4f2997b
 size 4781670360

model-00026-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0d7a849cf3b3ab6d9a653cf79578e892baeeba5885c1170b5d3c78ac19d483a1
 size 4964101416

 version https://git-lfs.github.com/spec/v1
+oid sha256:1d77373b7e74461abdf6940267341b8c5520e425ffc08e73e3c66120caf054d1
 size 4964101416

model-00027-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a3f1d016dcd0a2755c2b51afbce00e4fbd3fbe25e1bdfb805a5097147bf2611c
 size 4781637360

 version https://git-lfs.github.com/spec/v1
+oid sha256:f2bdf795798cf1aab4c3e5ddac541dae5761f6510ccdebe79237b24e5f69d441
 size 4781637360

model-00028-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b50782dd5068979cc77b3809fddc9ac1196bc604c0873988710d0de60b476da9
 size 4781670360

 version https://git-lfs.github.com/spec/v1
+oid sha256:2db8c732238a30a17a28613fe73b80976061bd5db90755a3ff28b6e33cb3248e
 size 4781670360

model-00029-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:973d0d8ae68989048f87cca364ff106ead0946e82fbafadcce5c29c9c19c4c82
 size 4781670360

 version https://git-lfs.github.com/spec/v1
+oid sha256:663249faff6efe37e051a9aaccc71261b2da33cbb24ef206521ff5b4e0c58374
 size 4781670360

model-00030-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:45b4affcb9769b30af316145d42a5b02fc61cddf2ef2c486398a19305e55850c
 size 3208747032

 version https://git-lfs.github.com/spec/v1
+oid sha256:656bb4379b28844acf96b3f66f35c0eb05bc260e9bf5f7806ef5f86153ec6b24
 size 3208747032

model-00031-of-00031.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c822e56b16330f9cbe0e6ec191a6fc1f17496b9806bdbfed1d32334689bb5082
 size 2491416704

 version https://git-lfs.github.com/spec/v1
+oid sha256:1c46c7193ab62a216807498d445f89b75dbbfeb91c68cd69809b0d030ab82830
 size 2491416704
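
Each shard above is tracked as a Git LFS pointer with three fields: `version`, `oid sha256:<hex>`, and `size`. Below is a minimal sketch of checking downloaded bytes against such a pointer using only `hashlib`. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the demo uses a tiny synthetic payload; a real multi-gigabyte shard would be hashed in chunks from disk rather than held in memory.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes have the size and SHA-256 digest named by the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.sha256(data).hexdigest() == pointer["digest"]


# Synthetic example: build a pointer for a small payload, then verify the payload against it.
payload = b"hello"
pointer_text = (
    "version https://git-lfs.github.com/spec/v1\n"
    f"oid sha256:{hashlib.sha256(payload).hexdigest()}\n"
    f"size {len(payload)}\n"
)
pointer = parse_lfs_pointer(pointer_text)
print(matches_pointer(payload, pointer))
```

Because every shard keeps its `size` while its `oid` changes, a size check alone would not distinguish the old weights from the new ones; the digest comparison is what does.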

special_tokens_map.json CHANGED

@@ -15,7 +15,7 @@
     "<|video_pad|>"
   ],
   "eos_token": {
-    "content": "<|im_end|>",
     "lstrip": false,
     "normalized": false,
     "rstrip": false,

     "<|video_pad|>"
   ],
   "eos_token": {
+    "content": "<|endoftext|>",
     "lstrip": false,
     "normalized": false,
     "rstrip": false,

tokenizer_config.json CHANGED

@@ -197,7 +197,7 @@
   "bos_token": null,
   "chat_template": "{%- if tools %}\n    {{- '<|im_start|>system\\n' }}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- messages[0]['content'] }}\n    {%- else %}\n        {{- 'You are CantoneseLLM, created by hon9kon9ize. You are a helpful assistant.' }}\n    {%- endif %}\n    {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n    {%- for tool in tools %}\n        {{- \"\\n\" }}\n        {{- tool | tojson }}\n    {%- endfor %}\n    {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n    {%- else %}\n        {{- '<|im_start|>system\\nYou are CantoneseLLM, created by hon9kon9ize. 
You are a helpful assistant.<|im_end|>\\n' }}\n    {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n    {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n        {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n    {%- elif message.role == \"assistant\" %}\n        {{- '<|im_start|>' + message.role }}\n        {%- if message.content %}\n            {{- '\\n' + message.content }}\n        {%- endif %}\n        {%- for tool_call in message.tool_calls %}\n            {%- if tool_call.function is defined %}\n                {%- set tool_call = tool_call.function %}\n            {%- endif %}\n            {{- '\\n<tool_call>\\n{\"name\": \"' }}\n            {{- tool_call.name }}\n            {{- '\", \"arguments\": ' }}\n            {{- tool_call.arguments | tojson }}\n            {{- '}\\n</tool_call>' }}\n        {%- endfor %}\n        {{- '<|im_end|>\\n' }}\n    {%- elif message.role == \"tool\" %}\n        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n            {{- '<|im_start|>user' }}\n        {%- endif %}\n        {{- '\\n<tool_response>\\n' }}\n        {{- message.content }}\n        {{- '\\n</tool_response>' }}\n        {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n            {{- '<|im_end|>\\n' }}\n        {%- endif %}\n    {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n    {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
   "clean_up_tokenization_spaces": false,
-  "eos_token": "<|im_end|>",
   "errors": "replace",
   "model_max_length": 131072,
   "pad_token": "<|endoftext|>",

   "bos_token": null,
   "chat_template": "{%- if tools %}\n    {{- '<|im_start|>system\\n' }}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- messages[0]['content'] }}\n    {%- else %}\n        {{- 'You are CantoneseLLM, created by hon9kon9ize. You are a helpful assistant.' }}\n    {%- endif %}\n    {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n    {%- for tool in tools %}\n        {{- \"\\n\" }}\n        {{- tool | tojson }}\n    {%- endfor %}\n    {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n    {%- else %}\n        {{- '<|im_start|>system\\nYou are CantoneseLLM, created by hon9kon9ize. 
You are a helpful assistant.<|im_end|>\\n' }}\n    {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n    {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n        {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n    {%- elif message.role == \"assistant\" %}\n        {{- '<|im_start|>' + message.role }}\n        {%- if message.content %}\n            {{- '\\n' + message.content }}\n        {%- endif %}\n        {%- for tool_call in message.tool_calls %}\n            {%- if tool_call.function is defined %}\n                {%- set tool_call = tool_call.function %}\n            {%- endif %}\n            {{- '\\n<tool_call>\\n{\"name\": \"' }}\n            {{- tool_call.name }}\n            {{- '\", \"arguments\": ' }}\n            {{- tool_call.arguments | tojson }}\n            {{- '}\\n</tool_call>' }}\n        {%- endfor %}\n        {{- '<|im_end|>\\n' }}\n    {%- elif message.role == \"tool\" %}\n        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n            {{- '<|im_start|>user' }}\n        {%- endif %}\n        {{- '\\n<tool_response>\\n' }}\n        {{- message.content }}\n        {{- '\\n</tool_response>' }}\n        {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n            {{- '<|im_end|>\\n' }}\n        {%- endif %}\n    {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n    {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
   "clean_up_tokenization_spaces": false,
+  "eos_token": "<|endoftext|>",
   "errors": "replace",
   "model_max_length": 131072,
   "pad_token": "<|endoftext|>",

train_results.json CHANGED

@@ -1,8 +1,8 @@
 {
-    "epoch": 2.9992804029743345,
-    "total_flos": 3.055078882714583e+17,
-    "train_loss": 0.20201521048409316,
-    "train_runtime": 48283.8245,
-    "train_samples_per_second": 4.144,
-    "train_steps_per_second": 0.129
 }

 {
+    "epoch": 2.9964020148716717,
+    "total_flos": 5.073775214995702e+17,
+    "train_loss": 0.5172660127031643,
+    "train_runtime": 62603.1223,
+    "train_samples_per_second": 3.196,
+    "train_steps_per_second": 0.033
 }

trainer_log.jsonl CHANGED

The diff for this file is too large to render. See raw diff

trainer_state.json CHANGED

The diff for this file is too large to render. See raw diff

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2e8ba7b9430145fd4d8652a083b441e4ef960834e061b434c8e9eb601fd64708
-size 7352

 version https://git-lfs.github.com/spec/v1
+oid sha256:699d52cf06aa151f43cc82bf48ddba6f12cc6270440e0acbb8fbc0154ff38597
+size 7288

training_eval_loss.png CHANGED

training_loss.png CHANGED