FatCat87 committed (verified)
Commit f9ee75a · Parent(s): 8ae0001

End of training

Files changed (2):
1. README.md +19 -21
2. adapter_model.bin +2 -2
README.md CHANGED
@@ -4,9 +4,9 @@ library_name: peft
 tags:
 - axolotl
 - generated_from_trainer
-base_model: princeton-nlp/Sheared-LLaMA-1.3B
+base_model: Qwen/Qwen2.5-1.5B
 model-index:
-- name: 76e772b1-fe02-45e3-aa5b-7b36ec7abf8d
+- name: 1a4c1a98-2c44-4d6c-a706-eb21745cbeb9
   results: []
 ---
 
@@ -19,19 +19,19 @@ should probably proofread and complete it, then remove this comment. -->
 axolotl version: `0.4.1`
 ```yaml
 adapter: lora
-base_model: princeton-nlp/Sheared-LLaMA-1.3B
+base_model: Qwen/Qwen2.5-1.5B
 bf16: auto
 datasets:
 - data_files:
-  - cb26f8bb8a47c11f_train_data.json
+  - 3aa8598218912f7b_train_data.json
   ds_type: json
   format: custom
-  path: cb26f8bb8a47c11f_train_data.json
+  path: 3aa8598218912f7b_train_data.json
   type:
     field: null
     field_input: null
-    field_instruction: instruction
-    field_output: output
+    field_instruction: query
+    field_output: text
     field_system: null
     format: null
     no_input_format: null
@@ -51,7 +51,7 @@ fsdp_config: null
 gradient_accumulation_steps: 4
 gradient_checkpointing: true
 group_by_length: false
-hub_model_id: FatCat87/76e772b1-fe02-45e3-aa5b-7b36ec7abf8d
+hub_model_id: FatCat87/1a4c1a98-2c44-4d6c-a706-eb21745cbeb9
 learning_rate: 0.0002
 load_in_4bit: false
 load_in_8bit: true
@@ -73,8 +73,7 @@ sample_packing: true
 saves_per_epoch: 1
 seed: 701
 sequence_len: 4096
-special_tokens:
-  pad_token: </s>
+special_tokens: null
 strict: false
 tf32: false
 tokenizer_type: AutoTokenizer
@@ -83,9 +82,9 @@ val_set_size: 0.1
 wandb_entity: fatcat87-taopanda
 wandb_log_model: null
 wandb_mode: online
-wandb_name: 76e772b1-fe02-45e3-aa5b-7b36ec7abf8d
+wandb_name: 1a4c1a98-2c44-4d6c-a706-eb21745cbeb9
 wandb_project: subnet56
-wandb_runid: 76e772b1-fe02-45e3-aa5b-7b36ec7abf8d
+wandb_runid: 1a4c1a98-2c44-4d6c-a706-eb21745cbeb9
 wandb_watch: null
 warmup_ratio: 0.05
 weight_decay: 0.0
@@ -95,12 +94,12 @@ xformers_attention: null
 
 </details><br>
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/edodtlvp)
-# 76e772b1-fe02-45e3-aa5b-7b36ec7abf8d
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/2tuw4i2f)
+# 1a4c1a98-2c44-4d6c-a706-eb21745cbeb9
 
-This model is a fine-tuned version of [princeton-nlp/Sheared-LLaMA-1.3B](https://huggingface.co/princeton-nlp/Sheared-LLaMA-1.3B) on the None dataset.
+This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.5490
+- Loss: 2.4703
 
 ## Model description
 
@@ -130,17 +129,16 @@ The following hyperparameters were used during training:
 - total_eval_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps: 9
 - num_epochs: 1
 
 ### Training results
 
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.8322        | 0.0050 | 1    | 1.9699          |
-| 1.5589        | 0.2537 | 51   | 1.6483          |
-| 1.4753        | 0.5075 | 102  | 1.5744          |
-| 1.4788        | 0.7612 | 153  | 1.5490          |
+| 2.724         | 0.0952 | 1    | 2.7997          |
+| 2.6075        | 0.2857 | 3    | 2.6428          |
+| 2.4169        | 0.5714 | 6    | 2.4890          |
+| 2.3835        | 0.8571 | 9    | 2.4703          |
 
 
 ### Framework versions
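The card metadata in this diff (`library_name: peft`, `adapter: lora`) indicates the commit publishes a LoRA adapter trained on top of the new base model. A minimal loading sketch, assuming the hub repo holds a standard PEFT adapter checkpoint; the repo and base-model IDs come from the diff above, and the prompt is a placeholder:

```python
# Minimal sketch: attach the LoRA adapter from this repo to the new base model.
# Assumes a standard PEFT adapter layout on the Hub; IDs are taken from the diff.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen2.5-1.5B"
adapter_id = "FatCat87/1a4c1a98-2c44-4d6c-a706-eb21745cbeb9"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, adapter_id)

inputs = tokenizer("Hello, world", return_tensors="pt")  # placeholder prompt
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```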
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d0ba8c4ee480fe38718ba5baef8c771892e5806bacea580ba0421ba81448e4eb
-size 120052362
+oid sha256:30d2c742e5487f8931f8d076fcadff6b2317cdbea3731425d444008f743f7695
+size 147859242
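Both versions of adapter_model.bin are stored via Git LFS, so the diff only shows the pointer fields: `oid` is the SHA-256 of the file contents and `size` is its length in bytes. A small sketch, assuming a locally downloaded copy of the file at a hypothetical path, for checking it against the new pointer:

```python
# Sketch: verify a downloaded adapter_model.bin against the Git LFS pointer
# above (oid = SHA-256 of the contents, size = length in bytes).
import hashlib
import os

path = "adapter_model.bin"  # assumed local download location
expected_oid = "30d2c742e5487f8931f8d076fcadff6b2317cdbea3731425d444008f743f7695"
expected_size = 147859242

digest = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
        digest.update(chunk)

assert os.path.getsize(path) == expected_size, "size mismatch"
assert digest.hexdigest() == expected_oid, "sha256 mismatch"
print("adapter_model.bin matches the LFS pointer")
```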