End of training
README.md CHANGED
@@ -38,7 +38,7 @@ datasets:
 
 dataset_prepared_path: last_run_prepared
 val_set_size: 0.1
-output_dir: ./ft-
+output_dir: ./ft-v2
 hub_model_id: mahendra0203/mistral-test-alpaca
 
 adapter: qlora
@@ -69,6 +69,7 @@ gradient_accumulation_steps: 8 # Increased from 4
 micro_batch_size: 4 # Reduced from 16
 eval_batch_size: 4 # Reduced from 16
 num_epochs: 2
+max_steps: 1000
 optimizer: adamw_bnb_8bit
 lr_scheduler: cosine
 learning_rate: 0.0002
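For context on the hunk above: under the Hugging Face Trainer semantics that axolotl builds on, the effective batch size per device is micro_batch_size × gradient_accumulation_steps, and a non-zero max_steps caps training regardless of num_epochs. A minimal sketch of the arithmetic, assuming a single GPU (not from the card; multiply by world size for multi-GPU runs):

```python
# Arithmetic implied by the updated config; single-GPU assumption.
micro_batch_size = 4             # samples per forward/backward pass
gradient_accumulation_steps = 8  # micro-batches per optimizer step

effective_batch_size = micro_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 32 samples per optimizer step
```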
@@ -112,12 +113,12 @@ save_safetensors: true
 
 </details><br>
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/mahendra0203/ft-alpaca-mistral-hc/runs/
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/mahendra0203/ft-alpaca-mistral-hc/runs/78qqsr2h)
 # mistral-test-alpaca
 
 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.
+- Loss: 1.3251
 
 ## Model description
 
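An aside, not part of the card itself: the reported eval loss is mean token cross-entropy, so it converts to perplexity as exp(loss). A quick check of the two values appearing in this diff:

```python
# Perplexity implied by the eval losses in the card (ppl = exp(loss)).
import math

for loss in (1.3490, 1.3251):
    print(f"loss {loss:.4f} -> perplexity {math.exp(loss):.2f}")
# loss 1.3490 -> perplexity 3.85
# loss 1.3251 -> perplexity 3.76
```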
@@ -145,14 +146,14 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 20
--
+- training_steps: 2
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | 1.3818 | 0.6667 | 1 | 1.3490 |
-| 1.3841 | 1.1667 | 2 | 1.
+| 1.3841 | 1.1667 | 2 | 1.3251 |
 
 
 ### Framework versions
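Not in the diff itself, but since the config trains a qlora adapter pushed to mahendra0203/mistral-test-alpaca, here is a minimal usage sketch, assuming the hub repo holds PEFT adapter weights rather than merged full weights:

```python
# Hypothetical loading sketch: attach the published LoRA adapter to the base
# model. Assumes the repo contains PEFT adapter weights (`adapter: qlora`).
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", device_map="auto"
)
model = PeftModel.from_pretrained(base, "mahendra0203/mistral-test-alpaca")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
```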