qwen_new_mage_per_domain_balanced_1.5

Files changed (5)

README.md CHANGED

@@ -18,8 +18,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [Qwen/Qwen1.5-1.8B](https://huggingface.co/Qwen/Qwen1.5-1.8B) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.2937
-- Accuracy: 0.8863
 ## Model description
@@ -38,22 +38,28 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 2e-05
 - train_batch_size: 32
 - eval_batch_size: 32
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs: 3
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss | Accuracy |
 |:-------------:|:------:|:----:|:---------------:|:--------:|
-| 0.7355        | 0.0183 | 100  | 0.2731          | 0.8854   |
-| 0.3892        | 0.0366 | 200  | 0.4133          | 0.8156   |
-| 0.3087        | 0.0549 | 300  | 0.4579          | 0.8433   |
-| 0.2798        | 0.0732 | 400  | 0.2937          | 0.8863   |
 ### Framework versions

 This model is a fine-tuned version of [Qwen/Qwen1.5-1.8B](https://huggingface.co/Qwen/Qwen1.5-1.8B) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.0842
+- Accuracy: 0.9714
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 5e-06
 - train_batch_size: 32
 - eval_batch_size: 32
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
+- num_epochs: 1
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss | Accuracy |
 |:-------------:|:------:|:----:|:---------------:|:--------:|
+| 0.1807        | 0.0915 | 500  | 0.3355          | 0.8881   |
+| 0.1387        | 0.1831 | 1000 | 0.1860          | 0.9293   |
+| 0.1443        | 0.2746 | 1500 | 0.1421          | 0.9418   |
+| 0.1143        | 0.3662 | 2000 | 0.1273          | 0.9517   |
+| 0.1103        | 0.4577 | 2500 | 0.1393          | 0.9490   |
+| 0.1049        | 0.5492 | 3000 | 0.1159          | 0.9606   |
+| 0.0809        | 0.6408 | 3500 | 0.1267          | 0.9526   |
+| 0.0896        | 0.7323 | 4000 | 0.1104          | 0.9606   |
+| 0.0758        | 0.8239 | 4500 | 0.1341          | 0.9633   |
+| 0.0811        | 0.9154 | 5000 | 0.0842          | 0.9714   |
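
The Epoch and Step columns in the added rows imply a roughly constant number of optimizer steps per epoch. The sketch below estimates that figure, and a rough training-set size, from the logged pairs. It assumes a single device and no gradient accumulation, neither of which this diff confirms; the names `LOGGED` and `steps_per_epoch` are illustrative only.

```python
# (epoch, step) pairs copied from the added training-results rows.
LOGGED = [
    (0.0915, 500), (0.1831, 1000), (0.2746, 1500), (0.3662, 2000),
    (0.4577, 2500), (0.5492, 3000), (0.6408, 3500), (0.7323, 4000),
    (0.8239, 4500), (0.9154, 5000),
]

# train_batch_size from the hyperparameter list.
TRAIN_BATCH_SIZE = 32


def steps_per_epoch(logged):
    """Least-squares slope through the origin of step versus epoch."""
    numerator = sum(epoch * step for epoch, step in logged)
    denominator = sum(epoch * epoch for epoch, _ in logged)
    return numerator / denominator


spe = steps_per_epoch(LOGGED)

# ASSUMPTION: one device, gradient_accumulation_steps == 1.
# With more devices or accumulation, the true example count is larger.
approx_examples = spe * TRAIN_BATCH_SIZE

print(f"steps per epoch   ~ {spe:.0f}")
print(f"training examples ~ {approx_examples:.0f} (under the stated assumptions)")
```

Under those assumptions the estimate is about 5,460 steps per epoch, so the last logged row (step 5000) falls a little short of one full epoch.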
 ### Framework versions

evaluation_results.json ADDED

+{
+    "eval_loss": 0.29367899894714355,
+    "eval_accuracy": 0.8863025962399284,
+    "eval_runtime": 35.1558,
+    "eval_samples_per_second": 31.773,
+    "eval_steps_per_second": 0.996,
+    "epoch": 0.07323324789454412
+}
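
The throughput fields in this file can be cross-checked against each other. The sketch below recovers the approximate evaluation-set size and step count from `eval_runtime`, then compares the step count with what `eval_batch_size: 32` from README.md would give. It assumes single-device evaluation, which this diff does not state. Note that the loss, accuracy, and epoch values here match the removed README.md lines (step 400), not the added ones.

```python
import json
import math

# Contents of evaluation_results.json as added in this commit.
RAW = """
{
    "eval_loss": 0.29367899894714355,
    "eval_accuracy": 0.8863025962399284,
    "eval_runtime": 35.1558,
    "eval_samples_per_second": 31.773,
    "eval_steps_per_second": 0.996,
    "epoch": 0.07323324789454412
}
"""

results = json.loads(RAW)

# samples = runtime * samples/s, steps = runtime * steps/s (both rounded,
# since the reported rates are themselves rounded).
n_samples = round(results["eval_runtime"] * results["eval_samples_per_second"])
n_steps = round(results["eval_runtime"] * results["eval_steps_per_second"])

# ASSUMPTION: evaluation ran on one device with eval_batch_size = 32.
EVAL_BATCH_SIZE = 32
expected_steps = math.ceil(n_samples / EVAL_BATCH_SIZE)

print(f"evaluation samples ~ {n_samples}")
print(f"evaluation steps   ~ {n_steps} (expected {expected_steps} at batch size 32)")
```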

model-00001-of-00002.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3ce3d9b57e79624dab2def7a89cb835512538e32af92b8f75df25348a97d4b5a
 size 4955308912

 version https://git-lfs.github.com/spec/v1
+oid sha256:738a95aaa357cdeec54c5282ac1cf2afd7cef77bee594ce14c614b3407e311ae
 size 4955308912

model-00002-of-00002.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:48a602a71f21cf6f632a54a988f5d83981bd4750d12ea7b50071a0e2241ff034
 size 1147395408

 version https://git-lfs.github.com/spec/v1
+oid sha256:dee692604a64516af4f8f5ed47fd6b0b29f07b26825d9462dcb85abb9a7b1a9a
 size 1147395408

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d9b60139369c744f986a4c30f484555e7e931535e218b54d38cac9893824e4e1
 size 5304

 version https://git-lfs.github.com/spec/v1
+oid sha256:76630ee7ca57d10ae6eb51557aa48d654231d33615c8c4c4854af4823bf72710
 size 5304
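
The three binary files above are stored as Git LFS pointers: only the `oid` changes in each, while `size` stays the same. As a minimal sketch, such a pointer can be parsed into its fields as follows. The helper name `parse_lfs_pointer` is hypothetical and not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


# New-side pointer for training_args.bin, copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:76630ee7ca57d10ae6eb51557aa48d654231d33615c8c4c4854af4823bf72710
size 5304
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algorithm"], pointer["digest"][:12], pointer["size"])
```

Comparing the `digest` of the old and new pointers is enough to tell that a file's contents changed, even when the byte size is identical.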