jd0g/Mistral-7B-NLI-v0.1

- README.md                  +16 -16
- adapter_config.json         +1  -1
- adapter_model.safetensors   +2  -2
- training_args.bin           +1  -1
README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [TheBloke/Mistral-7B-v0.1-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-v0.1-GPTQ) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss:
+- Loss: 0.4930
 
 ## Model description
 
@@ -35,12 +35,12 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.
-- train_batch_size:
-- eval_batch_size:
+- learning_rate: 0.0003
+- train_batch_size: 8
+- eval_batch_size: 16
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size:
+- total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
@@ -51,17 +51,17 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
-| 0.
-| 0.
-|
-|
-|
-| 0.
-| 0.
-| 0.
-|
-|
-|
+| 0.4947        | 0.9996  | 598  | 0.4534          |
+| 0.4418        | 1.9992  | 1196 | 0.4475          |
+| 0.4262        | 2.9987  | 1794 | 0.4476          |
+| 0.4125        | 4.0     | 2393 | 0.4499          |
+| 0.4015        | 4.9996  | 2991 | 0.4552          |
+| 0.3908        | 5.9992  | 3589 | 0.4591          |
+| 0.3809        | 6.9987  | 4187 | 0.4653          |
+| 0.3712        | 8.0     | 4786 | 0.4721          |
+| 0.3635        | 8.9996  | 5384 | 0.4783          |
+| 0.3562        | 9.9992  | 5982 | 0.4868          |
+| 0.3496        | 10.9954 | 6578 | 0.4930          |
 
 
 ### Framework versions
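For readers who want to set up a comparable run, the hyperparameters above map onto the standard `transformers` `TrainingArguments` roughly as sketched below. This is not the author's actual training script: the output directory, evaluation strategy, and epoch count are assumptions (the epoch count is read off the results table), and `total_train_batch_size: 32` is simply `train_batch_size × gradient_accumulation_steps = 8 × 4`.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed in the model card;
# the training_args.bin in this commit is the authoritative record.
training_args = TrainingArguments(
    output_dir="mistral-7b-nli-lora",   # assumed; not stated in the card
    learning_rate=3e-4,                 # learning_rate: 0.0003
    per_device_train_batch_size=8,      # train_batch_size: 8
    per_device_eval_batch_size=16,      # eval_batch_size: 16
    gradient_accumulation_steps=4,      # effective batch size: 8 * 4 = 32
    seed=42,                            # seed: 42
    lr_scheduler_type="linear",         # lr_scheduler_type: linear
    warmup_steps=2,                     # lr_scheduler_warmup_steps: 2
    num_train_epochs=11,                # inferred from the results table (~11 epochs)
    evaluation_strategy="epoch",        # assumed: one validation row per epoch
)
# Adam betas=(0.9, 0.999) and epsilon=1e-08 are the optimizer defaults,
# so they do not need to be set explicitly.
```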
adapter_config.json CHANGED

@@ -1,7 +1,7 @@
 {
   "alpha_pattern": {},
   "auto_mapping": null,
-  "base_model_name_or_path":
+  "base_model_name_or_path": null,
   "bias": "none",
   "fan_in_fan_out": false,
   "inference_mode": true,
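Because the updated `adapter_config.json` sets `base_model_name_or_path` to `null`, the base model can no longer be resolved from the adapter config alone. A minimal loading sketch, assuming the adapter is published under `jd0g/Mistral-7B-NLI-v0.1` and that a GPTQ-capable backend is installed for the quantized base:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "TheBloke/Mistral-7B-v0.1-GPTQ"  # base model named in the card
adapter_id = "jd0g/Mistral-7B-NLI-v0.1"    # this repo, assumed to host the adapter

# base_model_name_or_path is null in adapter_config.json, so the base model
# has to be loaded explicitly rather than inferred from the adapter config.
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Attach the LoRA weights from adapter_model.safetensors on top of the base.
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()
```

With `base_model_name_or_path` pointing at the base repo instead of `null`, `peft`'s `AutoPeftModelForCausalLM` could resolve the base automatically; with `null` it has to be supplied explicitly as above.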
adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:1a74d048a07df7223efdc5042731308c2707d0a1a6a21aef7b7dc348e1ec7eec
+size 8402496
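The pointer above records only the content hash and byte size of the actual `adapter_model.safetensors` stored in LFS. A quick integrity check against a locally downloaded copy, assuming the file path below, could look like:

```python
import hashlib
from pathlib import Path

# Assumed local path to the downloaded file (e.g. via huggingface_hub);
# adjust to wherever the real adapter_model.safetensors lives.
path = Path("adapter_model.safetensors")

print(path.stat().st_size)                            # pointer says: size 8402496
print(hashlib.sha256(path.read_bytes()).hexdigest())  # pointer says: oid sha256:1a74d048...
```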
training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:85b86abb05ef9079982cd583fd138f45e13fe8ab0cce47062420dd7aae689ddd
 size 4539
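`training_args.bin` is the `TrainingArguments` object that `Trainer` serializes with `torch.save`, so it can be inspected directly. A small sketch, assuming `transformers` is installed so the object can be unpickled:

```python
import torch

# Trainer saves its TrainingArguments with torch.save, so the object can be
# loaded back for inspection. weights_only=False is needed on recent PyTorch
# versions because this is a pickled Python object, not a plain tensor file.
args = torch.load("training_args.bin", weights_only=False)
print(args.learning_rate, args.per_device_train_batch_size, args.gradient_accumulation_steps)
```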