---
library_name: transformers
tags:
- generated_from_trainer
model-index:
- name: de_childes_30
  results: []
---

# de_childes_30

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 2.6056
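
Since the card does not name the architecture or the hub path, the snippet below is only a minimal loading sketch: the repository id `fpadovani/de_childes_30` and the causal-LM head are assumptions, not confirmed by this card.

```python
# Minimal loading sketch. The repo id and the AutoModelForCausalLM head are
# assumptions; the card does not state the architecture or hub path.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "fpadovani/de_childes_30"  # hypothetical hub path
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

inputs = tokenizer("Der Hund", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```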

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training; a `TrainingArguments` sketch follows the list:

- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 30
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 40000
- training_steps: 100000
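
A minimal sketch of how the values above map onto `transformers.TrainingArguments`; `output_dir` and anything not listed in this card are placeholders, not the author's actual configuration.

```python
# Sketch of the hyperparameters above expressed as TrainingArguments.
# Only the values listed in the card are meaningful; output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="de_childes_30",       # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=30,
    gradient_accumulation_steps=2,    # effective train batch size: 16 * 2 = 32
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=40_000,
    max_steps=100_000,
)
```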

### Training results

| Training Loss | Epoch   | Step   | Validation Loss |
|:-------------:|:-------:|:------:|:---------------:|
| No log        | 1.5021  | 2000   | 7.4721          |
| 7.3851        | 3.0041  | 4000   | 6.4176          |
| 7.3851        | 4.5062  | 6000   | 6.3026          |
| 6.0706        | 6.0083  | 8000   | 6.2009          |
| 6.0706        | 7.5103  | 10000  | 6.1058          |
| 5.8733        | 9.0124  | 12000  | 6.0330          |
| 5.8733        | 10.5145 | 14000  | 5.9681          |
| 5.723         | 12.0165 | 16000  | 5.8789          |
| 5.723         | 13.5186 | 18000  | 5.8198          |
| 5.6127        | 15.0207 | 20000  | 5.8021          |
| 5.6127        | 16.5227 | 22000  | 5.7662          |
| 5.5325        | 18.0248 | 24000  | 5.7319          |
| 5.5325        | 19.5268 | 26000  | 5.7032          |
| 5.4632        | 21.0289 | 28000  | 5.7046          |
| 5.4632        | 22.5310 | 30000  | 5.5708          |
| 5.2633        | 24.0330 | 32000  | 4.9945          |
| 5.2633        | 25.5351 | 34000  | 4.5232          |
| 4.3386        | 27.0372 | 36000  | 4.1401          |
| 4.3386        | 28.5392 | 38000  | 3.8578          |
| 3.6823        | 30.0413 | 40000  | 3.6679          |
| 3.6823        | 31.5434 | 42000  | 3.5397          |
| 3.3424        | 33.0454 | 44000  | 3.4013          |
| 3.3424        | 34.5475 | 46000  | 3.3154          |
| 3.1266        | 36.0496 | 48000  | 3.2563          |
| 3.1266        | 37.5516 | 50000  | 3.1665          |
| 2.9693        | 39.0537 | 52000  | 3.1341          |
| 2.9693        | 40.5558 | 54000  | 3.0564          |
| 2.8544        | 42.0578 | 56000  | 3.0329          |
| 2.8544        | 43.5599 | 58000  | 2.9552          |
| 2.758         | 45.0620 | 60000  | 2.9492          |
| 2.758         | 46.5640 | 62000  | 2.8937          |
| 2.684         | 48.0661 | 64000  | 2.8662          |
| 2.684         | 49.5682 | 66000  | 2.8495          |
| 2.6223        | 51.0702 | 68000  | 2.8253          |
| 2.6223        | 52.5723 | 70000  | 2.7846          |
| 2.5704        | 54.0744 | 72000  | 2.7910          |
| 2.5704        | 55.5764 | 74000  | 2.7503          |
| 2.5235        | 57.0785 | 76000  | 2.7392          |
| 2.5235        | 58.5805 | 78000  | 2.7223          |
| 2.4865        | 60.0826 | 80000  | 2.7142          |
| 2.4865        | 61.5847 | 82000  | 2.7068          |
| 2.4549        | 63.0867 | 84000  | 2.7050          |
| 2.4549        | 64.5888 | 86000  | 2.6745          |
| 2.427         | 66.0909 | 88000  | 2.6638          |
| 2.427         | 67.5929 | 90000  | 2.6596          |
| 2.4092        | 69.0950 | 92000  | 2.6478          |
| 2.4092        | 70.5971 | 94000  | 2.6531          |
| 2.3873        | 72.0991 | 96000  | 2.6315          |
| 2.3873        | 73.6012 | 98000  | 2.6405          |
| 2.3734        | 75.1033 | 100000 | 2.6056          |
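
Assuming the reported losses are mean token-level cross-entropy in nats (the usual Trainer metric for language modelling), they convert to perplexity via exp(loss); the final validation loss of 2.6056 corresponds to a perplexity of about 13.5.

```python
# Convert a cross-entropy loss (nats/token) to perplexity, assuming the
# reported values are mean token-level cross-entropy.
import math

final_val_loss = 2.6056
perplexity = math.exp(final_val_loss)
print(f"{perplexity:.2f}")  # ~13.54
```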

### Framework versions

- Transformers 4.45.2
- Pytorch 2.5.1+cu124
- Datasets 3.0.1
- Tokenizers 0.20.1