AlekseyKorshuk committed
Commit 9224351 · verified · 1 Parent(s): 96cef03

End of training

README.md CHANGED
@@ -1,11 +1,14 @@
 ---
 license: mit
-base_model: microsoft/phi-2
+base_model: AlekseyKorshuk/ultrachat-phi-2-sft-chatml
 tags:
 - axolotl
+- dpo
+- trl
+- dpo
 - generated_from_trainer
 model-index:
-- name: ultrachat-phi-2-sft-chatml
+- name: ultrachat-phi-2-dpo-chatml
   results: []
 ---
 
@@ -17,30 +20,31 @@ should probably proofread and complete it, then remove this comment. -->
 
 axolotl version: `0.4.0`
 ```yaml
-base_model: microsoft/phi-2
+base_model: AlekseyKorshuk/ultrachat-phi-2-sft-chatml
 model_type: AutoModelForCausalLM
 tokenizer_type: AutoTokenizer
 trust_remote_code: true
 
-hub_model_id: AlekseyKorshuk/ultrachat-phi-2-sft-chatml
+hub_model_id: AlekseyKorshuk/ultrachat-phi-2-dpo-chatml
 hub_strategy: every_save
 
 load_in_8bit: false
 load_in_4bit: false
 strict: false
 
+rl: dpo
 datasets:
-  - path: AlekseyKorshuk/ultrachat_200k
-    split: train_sft
-    type: sharegpt
-    conversation: chatml
+  - path: argilla/ultrafeedback-binarized-preferences
+    split: train
+    type: chatml.argilla
+
 
 dataset_prepared_path:
-val_set_size: 0
+#val_set_size: 0.001
 output_dir: ./output
 
 sequence_len: 2048
-sample_packing: false
+#sample_packing: false # currently unsupported
 pad_to_sequence_len:
 
 lora_r:
@@ -53,12 +57,12 @@ lora_fan_in_fan_out:
 wandb_project: ui-thesis
 wandb_entity:
 wandb_watch:
-wandb_name: ultrachat-phi-2-sft-chatml
+wandb_name: ultrachat-phi-2-dpo-chatml
 wandb_log_model:
 
-gradient_accumulation_steps: 2
-micro_batch_size: 16
-num_epochs: 1
+gradient_accumulation_steps: 4
+micro_batch_size: 8
+num_epochs: 3
 optimizer: paged_adamw_8bit
 adam_beta1: 0.9
 adam_beta2: 0.95
@@ -66,9 +70,11 @@ max_grad_norm: 1.0
 adam_epsilon: 0.00001
 lr_scheduler: cosine
 cosine_min_lr_ratio: 0.1
-learning_rate: 4e-5
-warmup_ratio: 0.1
-weight_decay: 0.1
+learning_rate: 5.0e-7
+warmup_steps: 32
+#warmup_ratio: 0.1
+weight_decay: 0.01
+dpo_beta: 0.01
 
 train_on_inputs: false
 group_by_length: false
@@ -76,6 +82,7 @@ bf16: true
 fp16: false
 tf32: true
 
+
 gradient_checkpointing: true
 early_stopping_patience:
 resume_from_checkpoint:
@@ -85,34 +92,30 @@ xformers_attention:
 flash_attention: true
 
 
-evals_per_epoch: 0
-eval_table_size: 8 # Approximate number of predictions sent to wandb depending on batch size. Enabled above 0. Default is 0
-eval_table_max_new_tokens: 768 # Total number of tokens generated for predictions sent to wandb. Default is 128
-eval_sample_packing: false
+#evals_per_epoch: 5
+#eval_table_size: 8 # Approximate number of predictions sent to wandb depending on batch size. Enabled above 0. Default is 0
+#eval_table_max_new_tokens: 768 # Total number of tokens generated for predictions sent to wandb. Default is 128
 
 chat_template: chatml
-saves_per_epoch: 5
+#saves_per_epoch: 1
+save_steps: 500
 save_total_limit: 1
 seed: 42
 debug:
 deepspeed:
 
+
 fsdp:
 fsdp_config:
 resize_token_embeddings_to_32x: true
 
-special_tokens:
-  eos_token: "<|im_end|>"
-  pad_token: "<|endoftext|>"
-tokens:
-  - "<|im_start|>"
 ```
 
 </details><br>
 
-# ultrachat-phi-2-sft-chatml
+# ultrachat-phi-2-dpo-chatml
 
-This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) on the None dataset.
+This model is a fine-tuned version of [AlekseyKorshuk/ultrachat-phi-2-sft-chatml](https://huggingface.co/AlekseyKorshuk/ultrachat-phi-2-sft-chatml) on the None dataset.
 
 ## Model description
 
@@ -131,19 +134,19 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 4e-05
-- train_batch_size: 16
-- eval_batch_size: 16
+- learning_rate: 5e-07
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - distributed_type: multi-GPU
 - num_devices: 4
-- gradient_accumulation_steps: 2
+- gradient_accumulation_steps: 4
 - total_train_batch_size: 128
-- total_eval_batch_size: 64
+- total_eval_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps: 36
-- num_epochs: 1
+- lr_scheduler_warmup_steps: 32
+- training_steps: 1492
 
 ### Training results
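This commit turns the SFT checkpoint into a DPO run: the base model becomes the SFT model itself, the dataset switches from `AlekseyKorshuk/ultrachat_200k` to `argilla/ultrafeedback-binarized-preferences`, and `rl: dpo` with `dpo_beta: 0.01` is enabled. For orientation, a minimal sketch of the DPO objective that `dpo_beta` scales; the function and tensor names are illustrative, and axolotl's RL trainer implements the production version:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.01) -> torch.Tensor:
    """DPO loss (Rafailov et al., 2023). Each input is the summed
    log-probability of a chosen/rejected response under the trained policy
    or the frozen reference (here, the SFT model)."""
    chosen_margin = policy_chosen_logps - ref_chosen_logps
    rejected_margin = policy_rejected_logps - ref_rejected_logps
    # -log sigmoid(beta * reward margin); the small beta (0.01) keeps the
    # policy only loosely anchored to the SFT reference.
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()
```

The hyperparameter summary is consistent with the config: `micro_batch_size: 8` × `gradient_accumulation_steps: 4` × 4 GPUs gives the reported `total_train_batch_size` of 128.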
 
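A hypothetical usage sketch for the resulting checkpoint; the hub id, `trust_remote_code`, and the chatml template come from the config above, the rest is standard `transformers` (>= 4.34 for `apply_chat_template`):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AlekseyKorshuk/ultrachat-phi-2-dpo-chatml"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    trust_remote_code=True,  # mirrors trust_remote_code: true in the config
)

# chat_template: chatml wraps each turn in <|im_start|>/<|im_end|> markers.
messages = [{"role": "user", "content": "Explain DPO in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```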
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d421773a41c4ed4bccf69e07afeb8520d746baaeee593b17c1a59469311ad6e6
+oid sha256:37396b8bf1f4a7c7c449454aad7478e77ce2bbc222e08f5c8359e8920fb9ecdb
 size 4995584848
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ed4cd386f4b1347bb0db7914bd84e36c76f9cc31343922e670d9cbe7e13b07b3
+oid sha256:5cc8c2b8bc8603df77d58ca1162802cd00eee24ff7bd0e45df294506d9ca6e8b
 size 563833008
pytorch_model-00001-of-00002.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bcc1db82364384bba0e0ce446fd6c9802d7d1c267ef4ee8851126a014cb41639
+oid sha256:8449c5668f971b92336be6e039e7130a4cac61207ec010dc43358eb08488ef54
 size 4995685647
pytorch_model-00002-of-00002.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3f5034c2d3066bb7868859453966819ed69b84bb51ca159b68722ac6ec941f3f
+oid sha256:f00bbeb2093a3665f2469802bca3fec2286999cb6307e16db87b7c96269f4fa4
 size 563840390
runs/Jan27_14-25-22_ced685704e0d/events.out.tfevents.1706365758.ced685704e0d.6214.0 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:801750c7001deab906e201bd7b5e948c9cdb8f6b8dc6d79817807168a5977fd2
-size 637299
+oid sha256:198a29b0b8ff36d2308f88dcbc5ae37d5c4b52f04e26889e638e37b4e9ec8856
+size 949581
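The weight and log files above change only their Git LFS pointers: new `oid sha256:` digests with identical shard sizes, while the tfevents log grows from 637299 to 949581 bytes. A small sketch for verifying a downloaded shard against its pointer, assuming the file sits in the current directory:

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so multi-GB shards fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Expected digest taken from the updated LFS pointer above.
expected = "37396b8bf1f4a7c7c449454aad7478e77ce2bbc222e08f5c8359e8920fb9ecdb"
assert sha256_of("model-00001-of-00002.safetensors") == expected
```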