jd0g/Mistral-7B-NLI-v0.1
README.md
CHANGED
@@ -16,7 +16,12 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [TheBloke/Mistral-7B-v0.1-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-v0.1-GPTQ) on the None dataset.
 It achieves the following results on the evaluation set:
--
+- eval_loss: 1.4535
+- eval_runtime: 475.4748
+- eval_samples_per_second: 4.206
+- eval_steps_per_second: 0.067
+- epoch: 8.0
+- step: 8
 
 ## Model description
 
@@ -35,34 +40,18 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.
-- train_batch_size:
-- eval_batch_size:
+- learning_rate: 0.0002
+- train_batch_size: 32
+- eval_batch_size: 64
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size:
+- total_train_batch_size: 128
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
 - num_epochs: 10
 - mixed_precision_training: Native AMP
 
-### Training results
-
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| 1.8799 | 0.9231 | 3 | 1.6354 |
-| 1.6206 | 1.8462 | 6 | 1.3630 |
-| 1.3224 | 2.7692 | 9 | 1.1313 |
-| 0.8177 | 4.0 | 13 | 0.9223 |
-| 0.9144 | 4.9231 | 16 | 0.8115 |
-| 0.801 | 5.8462 | 19 | 0.7444 |
-| 0.7393 | 6.7692 | 22 | 0.7097 |
-| 0.5279 | 8.0 | 26 | 0.6836 |
-| 0.6872 | 8.9231 | 29 | 0.6745 |
-| 0.4589 | 9.2308 | 30 | 0.6735 |
-
-
 ### Framework versions
 
 - PEFT 0.10.0
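The hyperparameter list above maps directly onto `transformers.TrainingArguments`. Below is a minimal sketch of that mapping, assuming a single GPU (so 32 × 4 accumulation steps gives the total train batch size of 128); the `output_dir` is hypothetical, and the Adam betas/epsilon shown are the `Trainer` defaults the card reports. The `eval_*` keys in the results block above are the standard keys returned by `Trainer.evaluate()` under this setup.

```python
from transformers import TrainingArguments

# Minimal sketch matching the hyperparameters listed in the card.
training_args = TrainingArguments(
    output_dir="mistral-7b-nli-lora",  # hypothetical path, not from the card
    learning_rate=2e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=4,     # 32 * 4 = 128 total train batch size
    seed=42,
    adam_beta1=0.9,                    # Trainer defaults, shown for clarity
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=10,
    fp16=True,                         # "Native AMP" mixed-precision training
)
```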
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:0f9efc4a2328a1277d57835ea945a4f328c66b68f2d98ac92123911880ddc42b
+size 4221232
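This commit updates `adapter_model.safetensors`, the PEFT LoRA adapter (about 4.2 MB), not the full model weights, so inference means stacking the adapter on the GPTQ base model. A sketch of how that would typically look, assuming a GPTQ-capable backend (e.g. `optimum` plus a GPTQ kernel package) is installed and that the adapter lives in the repo named in the page title:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the quantized base model, then attach the fine-tuned LoRA adapter.
# device_map="auto" is an assumption, not stated in the card.
base = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-v0.1-GPTQ",
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "jd0g/Mistral-7B-NLI-v0.1")
tokenizer = AutoTokenizer.from_pretrained("TheBloke/Mistral-7B-v0.1-GPTQ")
```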
runs/Apr25_14-56-57_6e7fe67a7a94/events.out.tfevents.1714057027.6e7fe67a7a94.378.0
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6712b40280cfabbdf65166aee11588a6611514f7f83a42e48f2939b693de1cbc
+size 9322
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:420c7c7a2145f6ad251e935cfe280a7cfe25af7d95116c0c7cdd77bedd01561d
 size 4984
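The three-line files in these hunks are Git LFS pointers: `version`, `oid sha256:<digest>`, and `size <bytes>` describe the real binary stored in LFS rather than in the git tree. A sketch of verifying a downloaded blob against its pointer (file paths are hypothetical):

```python
import hashlib

def verify_lfs_pointer(pointer_path: str, blob_path: str) -> bool:
    """Check that a blob matches the oid/size recorded in a Git LFS pointer."""
    fields = {}
    with open(pointer_path) as f:
        for line in f:
            # Each pointer line is "<key> <value>", e.g. "oid sha256:0f9e...".
            key, _, value = line.strip().partition(" ")
            fields[key] = value
    with open(blob_path, "rb") as f:
        data = f.read()
    digest = hashlib.sha256(data).hexdigest()
    return fields["oid"] == f"sha256:{digest}" and int(fields["size"]) == len(data)

# Hypothetical usage: pointer text as committed vs. the LFS-downloaded file.
# verify_lfs_pointer("training_args.bin.pointer", "training_args.bin")
```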