lianghsun committed on
Commit 44cd75b · 1 Parent(s): 89cf561

This round of fine-tuning is based on the foundation model, version v2024.12.28, and uses self-prepared instruction datasets.

README.md CHANGED
@@ -24,12 +24,9 @@ datasets:
   - lianghsun/tw-contract-review-chat
   - lianghsun/reasoning-base-20k-chat
   - lianghsun/vulnerability-mitigation-qa-zh_tw
- - benchang1110/Belle-Taide
+ - lianghsun/tw-instruct
   - rombodawg/Everything_Instruct_Multilingual
- - BAAI/Infinity-Instruct
- - nisaar/LLAMA2_Legal_Dataset_4.4k_Instructions
   - xzuyn/manythings-translations-alpaca
- - neural-bridge/rag-hallucination-dataset-1000
   - neural-bridge/rag-dataset-12000
   - minyichen/glaive_toolcall_zh_tw
   pipeline_tag: text-generation
@@ -999,6 +996,7 @@ metrics:
 
   | Update Date | Model Version | Key Changes |
   |--------------|-----------------------|-------------------------------------|
+ | 2025/01/01 | v2025.01.01 | Fine-tuning is based on the [foundation model](https://huggingface.co/lianghsun/Llama-3.2-Taiwan-3B) version v2024.12.28, and it uses self-prepared instruction datasets for this round of fine-tuning. |
   | 2024/11/27 | v2024.11.27 | Completed SFT training (5/5 epochs). Preparing for multi-round DPO training. |
   | 2024/11/25 | v2024.11.25 | Updated model version to v2024.11.25, training progressed to (3/5) epochs. Still in SFT stage, DPO training remains pending. |
   | 2024/11/22 | v2024.11.22 | Initial upload: Model version v2024.11.22, training completed up to (1/5) epochs. Currently trained only on SFT, DPO training not yet performed. |
@@ -1096,7 +1094,6 @@ docker run --runtime nvidia --gpus all \
   - [lianghsun/tw-law-article-qa](https://huggingface.co/datasets/lianghsun/tw-law-article-qa)
   - [lianghsun/tw-judgment-qa](https://huggingface.co/datasets/lianghsun/tw-judgment-qa)
   - [lianghsun/tw-bar-examination-2020-chat](https://huggingface.co/datasets/lianghsun/tw-bar-examination-2020-chat)
- - [lianghsun/tw-emergency-medicine-bench](https://huggingface.co/datasets/lianghsun/tw-emergency-medicine-bench)
   - [lianghsun/tw-structured-law-article](https://huggingface.co/datasets/lianghsun/tw-structured-law-article)
   - [lianghsun/tw-judgment-gist-chat](https://huggingface.co/datasets/lianghsun/tw-judgment-gist-chat)
   - [lianghsun/vulnerability-mitigation-qa-zh_tw](https://huggingface.co/datasets/lianghsun/vulnerability-mitigation-qa-zh_tw)
@@ -1104,7 +1101,6 @@ docker run --runtime nvidia --gpus all \
   - [lianghsun/reasoning-base-20k-chat](https://huggingface.co/datasets/lianghsun/reasoning-base-20k-chat)
   - [lianghsun/tw-contract-review-chat](https://huggingface.co/datasets/lianghsun/tw-contract-review-chat)
   - [lianghsun/tw-legal-methodology-chat](https://huggingface.co/datasets/lianghsun/tw-legal-methodology-chat)
- - [benchang1110/Belle-Taide](https://huggingface.co/datasets/benchang1110/Belle-Taide)
   - [minyichen/glaive_toolcall_zh_tw](https://huggingface.co/datasets/minyichen/glaive_toolcall_zh_tw)
 
   </details>
@@ -1113,11 +1109,8 @@ docker run --runtime nvidia --gpus all \
   <summary><b>多國語系對話資料集</b></summary>
 
   - [rombodawg/Everything_Instruct_Multilingual](https://huggingface.co/datasets/rombodawg/Everything_Instruct_Multilingual)
- - [BAAI/Infinity-Instruct](https://huggingface.co/datasets/BAAI/Infinity-Instruct)
- - [nisaar/LLAMA2_Legal_Dataset_4.4k_Instructions](https://huggingface.co/datasets/nisaar/LLAMA2_Legal_Dataset_4.4k_Instructions)
   - [xzuyn/manythings-translations-alpaca](https://huggingface.co/datasets/xzuyn/manythings-translations-alpaca)
   - [neural-bridge/rag-dataset-12000](https://huggingface.co/datasets/neural-bridge/rag-dataset-12000)
- - [neural-bridge/rag-hallucination-dataset-1000](https://huggingface.co/datasets/neural-bridge/rag-hallucination-dataset-1000)
 
   </details>
 
@@ -1125,10 +1118,9 @@ docker run --runtime nvidia --gpus all \
 
   <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 
- #### Preprocessing [optional]
+ #### Preprocessing
 
- [More Information Needed]
-
+ (WIP)
 
   #### Training Hyperparameters
 
@@ -1416,4 +1408,4 @@ base_model: lianghsun/Llama-3.2-Taiwan-3B-Instruct
   - Transformers 4.45.2
   - Pytorch 2.4.1+cu121
   - Datasets 2.21.0
- - Tokenizers 0.20.0
+ - Tokenizers 0.20.0
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
   version https://git-lfs.github.com/spec/v1
- oid sha256:b2bff44179e46840bdba4d0101f5160ab7f3302625ec4e75f5a186868de2d6ee
+ oid sha256:c114557d56bfbec77c9c763cc71c1a04c1668566636cfe4d1d489f3ddb9f4ff7
   size 4965799096
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
   version https://git-lfs.github.com/spec/v1
- oid sha256:203ed4c8f522e3d6b4f08b6931d8db0c9a3aae352bdb6e19f20ad22a78a57593
- size 1459729952
+ oid sha256:6e20fae2caad0bd2dadabd50a7e5524a4421ecfe29b4b3b0aad1c1d0fab9de16
+ size 2247734992
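Because each pointer's `oid` is the SHA-256 digest of the tracked file's bytes, a downloaded shard can be verified against its pointer. A sketch, assuming the weights have been downloaded to a local path (the path in the comment is hypothetical):

```python
# Sketch: recompute a Git LFS oid (SHA-256 of the file's bytes) so a downloaded
# weight shard can be checked against the pointer committed above.
import hashlib

def lfs_oid(path: str, chunk_size: int = 1 << 20) -> str:
    """Return 'sha256:<hex digest>' of the file at `path`, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()

# After downloading (hypothetical local path), the digest should match the
# pointer's oid for this commit:
# lfs_oid("model-00002-of-00002.safetensors")
# == "sha256:6e20fae2caad0bd2dadabd50a7e5524a4421ecfe29b4b3b0aad1c1d0fab9de16"
```

Streaming the file in chunks keeps memory flat even for multi-gigabyte shards like the 2,247,734,992-byte file above.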