mahendra0203 committed (verified)
Commit 1603dcb · 1 Parent(s): 1edeb12

End of training

Files changed (1)
  1. README.md +6 -5
README.md CHANGED
@@ -38,7 +38,7 @@ datasets:
 
 dataset_prepared_path: last_run_prepared
 val_set_size: 0.1
- output_dir: ./ft-v1
+ output_dir: ./ft-v2
 hub_model_id: mahendra0203/mistral-test-alpaca
 
 adapter: qlora
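
The `adapter: qlora` line in this config means the run trains low-rank LoRA adapters on top of a 4-bit-quantized base model instead of updating the full 7B weights. As a minimal sketch of what that typically looks like with `transformers` + `peft` (the LoRA rank/alpha/dropout values below are illustrative placeholders; they are not shown in this diff):

```python
# Illustrative QLoRA setup: frozen 4-bit base model + trainable LoRA adapters.
# This is not axolotl's internal code, just the usual transformers/peft equivalent.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize the frozen base weights to 4-bit
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)  # standard preprocessing before k-bit training
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")  # placeholder values
model = get_peft_model(base, lora)            # only the adapter weights stay trainable
model.print_trainable_parameters()
```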
@@ -69,6 +69,7 @@ gradient_accumulation_steps: 8 # Increased from 4
 micro_batch_size: 4 # Reduced from 16
 eval_batch_size: 4 # Reduced from 16
 num_epochs: 2
+ max_steps: 1000
 optimizer: adamw_bnb_8bit
 lr_scheduler: cosine
 learning_rate: 0.0002
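
For context on the hunk above: `micro_batch_size: 4` with `gradient_accumulation_steps: 8` gives an effective batch size of 32 sequences per optimizer step per device, and the new `max_steps: 1000` caps the run at 1,000 optimizer steps (in the underlying HF Trainer, a positive `max_steps` takes precedence over the epoch count). A rough sketch of what `optimizer: adamw_bnb_8bit` and `lr_scheduler: cosine` usually translate to; the betas, epsilon and warmup values come from the hyperparameter list further down, and the tiny stand-in model only keeps the snippet runnable:

```python
# Rough equivalent of adamw_bnb_8bit + a cosine schedule with warmup (illustrative, not axolotl's code).
import torch
import bitsandbytes as bnb
from transformers import get_cosine_schedule_with_warmup

model = torch.nn.Linear(8, 8)        # stand-in; in the real run this is the QLoRA-wrapped Mistral model
optimizer = bnb.optim.AdamW8bit(
    model.parameters(),
    lr=2e-4,                         # learning_rate: 0.0002
    betas=(0.9, 0.95),               # from the model card's hyperparameter list
    eps=1e-5,
)
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=20,             # lr_scheduler_warmup_steps: 20
    num_training_steps=1000,         # max_steps: 1000
)
```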
@@ -112,12 +113,12 @@ save_safetensors: true
 
 </details><br>
 
- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/mahendra0203/ft-alpaca-mistral-hc/runs/yp7zk4y6)
+ [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/mahendra0203/ft-alpaca-mistral-hc/runs/78qqsr2h)
 # mistral-test-alpaca
 
 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the None dataset.
 It achieves the following results on the evaluation set:
- - Loss: 1.3238
+ - Loss: 1.3251
 
 ## Model description
 
@@ -145,14 +146,14 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 20
- - num_epochs: 2
+ - training_steps: 2
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | 1.3818 | 0.6667 | 1 | 1.3490 |
- | 1.3841 | 1.1667 | 2 | 1.3238 |
+ | 1.3841 | 1.1667 | 2 | 1.3251 |
 
 
 ### Framework versions
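
Since the config pushes a QLoRA adapter to `mahendra0203/mistral-test-alpaca`, a minimal inference sketch is a standard PEFT adapter load on top of the base model. This assumes the hub repo contains a PEFT (LoRA) adapter rather than merged full weights, and the Alpaca-style prompt below is only a guess suggested by the repo name, not something confirmed by this diff:

```python
# Minimal inference sketch, assuming mahendra0203/mistral-test-alpaca hosts a PEFT (LoRA) adapter.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "mahendra0203/mistral-test-alpaca")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

# Alpaca-style prompt template (an assumption based on the repo name, not on this diff).
prompt = "### Instruction:\nSummarize what QLoRA fine-tuning does.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(base.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

If a standalone checkpoint is needed, `model.merge_and_unload()` folds a LoRA adapter of this kind back into the base weights.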