Swephoenix
/

phi4-lora-xaji0y6d-1742330134

PEFT

Safetensors

Generated from Trainer


Swephoenix committed on Mar 18

Commit

ccbfc39

verified ·

1 Parent(s): 728de22

Model save


Files changed (1)

README.md +93 -0

README.md ADDED

	@@ -0,0 +1,93 @@

+---
+library_name: peft
+license: mit
+base_model: microsoft/Phi-4-mini-instruct
+tags:
+- generated_from_trainer
+model-index:
+- name: phi4-lora-xaji0y6d-1742330134
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# phi4-lora-xaji0y6d-1742330134
+This model is a fine-tuned version of [microsoft/Phi-4-mini-instruct](https://huggingface.co/microsoft/Phi-4-mini-instruct) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.0010
+- Perplexity: 2.7209
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-06
+- train_batch_size: 1
+- eval_batch_size: 1
+- seed: 42
+- gradient_accumulation_steps: 16
+- total_train_batch_size: 16
+- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_ratio: 0.01
+- num_epochs: 50
+- mixed_precision_training: Native AMP
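
The `total_train_batch_size` listed above follows from the per-device batch size and the gradient accumulation setting. A minimal sketch of that arithmetic, assuming a single training device (the card does not state the device count, and `effective_batch_size` is a hypothetical helper, not part of the Trainer API):

```python
def effective_batch_size(per_device_batch_size: int,
                         grad_accum_steps: int,
                         num_devices: int = 1) -> int:
    """Number of samples contributing to one optimizer step.

    num_devices=1 is an assumption; the model card does not report it.
    """
    return per_device_batch_size * grad_accum_steps * num_devices


# Values taken from the hyperparameter list:
# train_batch_size=1, gradient_accumulation_steps=16
total = effective_batch_size(per_device_batch_size=1, grad_accum_steps=16)
print(total)
```

With a per-device batch size of 1, accumulation is the only way this run reaches 16 samples per optimizer update.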
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Perplexity |
+|:-------------:|:-----:|:----:|:---------------:|:----------:|
+| 5.6626        | 1.48  | 10   | 5.8212          | 337.3485   |
+| 5.4363        | 2.96  | 20   | 5.4409          | 230.6381   |
+| 5.2185        | 4.32  | 30   | 5.2027          | 181.7434   |
+| 4.9729        | 5.8   | 40   | 4.9270          | 137.9507   |
+| 4.68          | 7.16  | 50   | 4.6071          | 100.1871   |
+| 4.3242        | 8.64  | 60   | 4.2787          | 72.1430    |
+| 4.0147        | 10.0  | 70   | 3.9536          | 52.1171    |
+| 3.7066        | 11.48 | 80   | 3.6597          | 38.8469    |
+| 3.3654        | 12.96 | 90   | 3.3835          | 29.4712    |
+| 3.1883        | 14.32 | 100  | 3.1183          | 22.6075    |
+| 2.8444        | 15.8  | 110  | 2.8578          | 17.4224    |
+| 2.6168        | 17.16 | 120  | 2.6088          | 13.5819    |
+| 2.3689        | 18.64 | 130  | 2.3749          | 10.7493    |
+| 2.1379        | 20.0  | 140  | 2.1532          | 8.6119     |
+| 1.8909        | 21.48 | 150  | 1.9458          | 6.9986     |
+| 1.7022        | 22.96 | 160  | 1.7602          | 5.8135     |
+| 1.5127        | 24.32 | 170  | 1.6061          | 4.9831     |
+| 1.3942        | 25.8  | 180  | 1.4847          | 4.4133     |
+| 1.3053        | 27.16 | 190  | 1.3923          | 4.0240     |
+| 1.2177        | 28.64 | 200  | 1.3193          | 3.7405     |
+| 1.1161        | 30.0  | 210  | 1.2557          | 3.5101     |
+| 1.1293        | 31.48 | 220  | 1.2023          | 3.3275     |
+| 1.0622        | 32.96 | 230  | 1.1562          | 3.1778     |
+| 1.015         | 34.32 | 240  | 1.1164          | 3.0536     |
+| 0.9539        | 35.8  | 250  | 1.0830          | 2.9533     |
+| 0.9387        | 37.16 | 260  | 1.0552          | 2.8725     |
+| 0.8819        | 38.64 | 270  | 1.0340          | 2.8121     |
+| 0.9162        | 40.0  | 280  | 1.0178          | 2.7670     |
+| 0.8912        | 41.48 | 290  | 1.0074          | 2.7384     |
+| 0.8641        | 42.96 | 300  | 1.0010          | 2.7209     |
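
The Perplexity column appears to be the exponential of the validation loss. The sketch below checks that relationship on three rows copied from the table; small differences in the last digits are presumably due to the card rounding the loss to four decimals before printing. The helper name `perplexity_from_loss` is illustrative and not taken from the training code.

```python
import math


def perplexity_from_loss(loss: float) -> float:
    """Perplexity for a mean cross-entropy loss measured in nats."""
    return math.exp(loss)


# (validation loss, reported perplexity) pairs copied from the table above
rows = [
    (5.8212, 337.3485),  # step 10
    (2.1532, 8.6119),    # step 140
    (1.0010, 2.7209),    # step 300
]

for loss, reported in rows:
    computed = perplexity_from_loss(loss)
    # Allow a small relative tolerance because the table values are rounded.
    matches = math.isclose(computed, reported, rel_tol=1e-3)
    print(f"loss={loss:.4f} computed={computed:.4f} reported={reported:.4f} match={matches}")
```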
+### Framework versions
+- PEFT 0.14.0
+- Transformers 4.48.2
+- Pytorch 2.1.0+cu118
+- Datasets 3.4.1
+- Tokenizers 0.21.1
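
The card itself gives no usage instructions. As a hedged sketch only, a LoRA adapter saved with PEFT is typically loaded on top of its base model as shown below. The adapter repository id is assumed from the page header, and the code assumes the repository actually contains the adapter weights and that `transformers` and `peft` versions compatible with those listed above are installed. `load_adapter` is a hypothetical helper name.

```python
BASE_ID = "microsoft/Phi-4-mini-instruct"
# Assumed from the repository name shown on this page.
ADAPTER_ID = "Swephoenix/phi4-lora-xaji0y6d-1742330134"


def load_adapter(base_id: str = BASE_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the LoRA adapter for inference."""
    # Heavy dependencies are imported lazily so that defining this
    # function does not require them to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the base model with the adapter weights from the Hub.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return model, tokenizer
```

Calling `load_adapter()` downloads both the base model and the adapter, so it needs network access and enough memory for Phi-4-mini-instruct.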