---
library_name: transformers
license: apache-2.0
base_model: bert-base-uncased
tags:
  - generated_from_trainer
model-index:
  - name: bert-philosophy-adapted
    results: []
datasets:
  - AiresPucrs/stanford-encyclopedia-philosophy
language:
  - en
pipeline_tag: fill-mask
---

# bert-philosophy-adapted

This model is a fine-tuned version of bert-base-uncased on the Stanford Encyclopedia of Philosophy dataset (AiresPucrs/stanford-encyclopedia-philosophy), trained with a masked language modeling objective. It achieves the following results on the evaluation set:

- Loss: 1.5044
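
The adapted encoder can be tried directly with the fill-mask pipeline. A minimal sketch, assuming the checkpoint is published under the repo id `maximuspowers/bert-philosophy-adapted`:

```python
from transformers import pipeline

# Repo id is an assumption based on this card's name; point it at wherever the checkpoint lives.
fill_mask = pipeline("fill-mask", model="maximuspowers/bert-philosophy-adapted")

for prediction in fill_mask("The categorical [MASK] is a central concept in Kantian ethics."):
    print(f"{prediction['token_str']:>15}  {prediction['score']:.4f}")
```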

## Model description

This model was trained to adapt a BERT encoder to philosophical terminology, as a base for further fine-tuning on downstream tasks such as classifying texts by school of philosophy.
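
As a sketch of that intended downstream use (the repo id and label set below are hypothetical), the checkpoint can be loaded with a freshly initialized classification head and then fine-tuned on labeled texts:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical repo id and label set, shown only to illustrate the intended downstream setup.
model_id = "maximuspowers/bert-philosophy-adapted"
labels = ["empiricism", "rationalism", "phenomenology", "pragmatism"]

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)
# The classification head is randomly initialized; it still needs supervised fine-tuning.
```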

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a Trainer sketch using them follows the list):

- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 3
- mixed_precision_training: Native AMP
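
A minimal sketch of how these hyperparameters plug into a masked-language-modeling run with `Trainer` is below. The text column name, the dataset splits, and the 15% masking probability (the library default) are assumptions rather than values stated in this card:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

dataset = load_dataset("AiresPucrs/stanford-encyclopedia-philosophy")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Assumes the raw text lives in a "text" column; truncates to BERT's 512-token limit.
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset["train"].column_names)
tokenized = tokenized["train"].train_test_split(test_size=0.1, seed=42)  # assumed eval split

# Dynamic masking with the default 15% masking probability.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

training_args = TrainingArguments(
    output_dir="bert-philosophy-adapted",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,   # effective train batch size of 32
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=3,
    seed=42,
    fp16=True,                       # native AMP mixed precision
    eval_strategy="steps",
    eval_steps=500,                  # matches the evaluation cadence in the results table
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=collator,
)
trainer.train()
```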

### Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 2.0568        | 0.1020 | 500   | 1.8821          |
| 1.9169        | 0.2039 | 1000  | 1.7939          |
| 1.873         | 0.3059 | 1500  | 1.7593          |
| 1.8408        | 0.4078 | 2000  | 1.7280          |
| 1.8461        | 0.5098 | 2500  | 1.7069          |
| 1.8108        | 0.6117 | 3000  | 1.6899          |
| 1.7959        | 0.7137 | 3500  | 1.6748          |
| 1.7771        | 0.8157 | 4000  | 1.6490          |
| 1.7705        | 0.9176 | 4500  | 1.6371          |
| 1.725         | 1.0196 | 5000  | 1.6317          |
| 1.707         | 1.1215 | 5500  | 1.6279          |
| 1.7127        | 1.2235 | 6000  | 1.6100          |
| 1.6806        | 1.3254 | 6500  | 1.5978          |
| 1.6809        | 1.4274 | 7000  | 1.5920          |
| 1.6766        | 1.5294 | 7500  | 1.5831          |
| 1.6598        | 1.6313 | 8000  | 1.5748          |
| 1.6632        | 1.7333 | 8500  | 1.5646          |
| 1.6433        | 1.8352 | 9000  | 1.5554          |
| 1.6317        | 1.9372 | 9500  | 1.5552          |
| 1.6141        | 2.0392 | 10000 | 1.5404          |
| 1.6328        | 2.1411 | 10500 | 1.5393          |
| 1.5981        | 2.2431 | 11000 | 1.5330          |
| 1.6192        | 2.3450 | 11500 | 1.5260          |
| 1.6051        | 2.4470 | 12000 | 1.5198          |
| 1.6218        | 2.5489 | 12500 | 1.5162          |
| 1.5721        | 2.6509 | 13000 | 1.5079          |
| 1.5656        | 2.7529 | 13500 | 1.5109          |
| 1.5642        | 2.8548 | 14000 | 1.5077          |
| 1.5715        | 2.9568 | 14500 | 1.5106          |
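
Since masked language modeling minimizes a cross-entropy loss, the final validation loss can be read as a rough (pseudo-)perplexity of exp(1.5044) ≈ 4.5:

```python
import math

eval_loss = 1.5044  # final validation loss from the table above
print(f"perplexity ≈ {math.exp(eval_loss):.2f}")  # ≈ 4.50
```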

### Framework versions

- Transformers 4.52.4
- Pytorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.1