bert-philosophy-adapted

This model is a fine-tuned version of bert-base-uncased on the Stanford Encyclopedia of Philosophy dataset, trained with masked language modeling. It achieves the following results on the evaluation set:

  • Loss: 1.5044
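
Assuming the reported loss is the mean cross-entropy over masked tokens (the standard masked-language-modeling objective), it corresponds to a perplexity of roughly exp(1.5044) ≈ 4.5. Below is a minimal, hedged sketch of querying the checkpoint for masked-token predictions; the example sentence is illustrative and not taken from the dataset:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint for masked-token prediction.
fill_mask = pipeline("fill-mask", model="maximuspowers/bert-philosophy-adapted")

# bert-base-uncased uses [MASK] as its mask token.
predictions = fill_mask("Epistemology is the study of [MASK].")
for p in predictions:
    print(p["token_str"], round(p["score"], 4))
```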

Model description

This model was trained to serve as a BERT encoder adapted to philosophical terminology, intended for further fine-tuning on downstream tasks such as school-of-philosophy text classification.
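
As a hedged sketch of that downstream use, the encoder can be loaded with a freshly initialized classification head; the label count below is a placeholder and not part of this release:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("maximuspowers/bert-philosophy-adapted")
model = AutoModelForSequenceClassification.from_pretrained(
    "maximuspowers/bert-philosophy-adapted",
    num_labels=4,  # placeholder: set to the number of classes in your label set
)
# The classification head is randomly initialized and still requires supervised fine-tuning.
```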

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an approximate TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3
  • mixed_precision_training: Native AMP
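
The training script itself is not published, so the following is only an approximate sketch of how these hyperparameters map onto a Hugging Face TrainingArguments configuration; output_dir and any settings not listed above are assumptions:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-philosophy-adapted",  # assumed; not stated in the card
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,         # effective train batch size of 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=3,
    fp16=True,                             # native AMP mixed precision
)
```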

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 2.0568        | 0.1020 | 500   | 1.8821          |
| 1.9169        | 0.2039 | 1000  | 1.7939          |
| 1.873         | 0.3059 | 1500  | 1.7593          |
| 1.8408        | 0.4078 | 2000  | 1.7280          |
| 1.8461        | 0.5098 | 2500  | 1.7069          |
| 1.8108        | 0.6117 | 3000  | 1.6899          |
| 1.7959        | 0.7137 | 3500  | 1.6748          |
| 1.7771        | 0.8157 | 4000  | 1.6490          |
| 1.7705        | 0.9176 | 4500  | 1.6371          |
| 1.725         | 1.0196 | 5000  | 1.6317          |
| 1.707         | 1.1215 | 5500  | 1.6279          |
| 1.7127        | 1.2235 | 6000  | 1.6100          |
| 1.6806        | 1.3254 | 6500  | 1.5978          |
| 1.6809        | 1.4274 | 7000  | 1.5920          |
| 1.6766        | 1.5294 | 7500  | 1.5831          |
| 1.6598        | 1.6313 | 8000  | 1.5748          |
| 1.6632        | 1.7333 | 8500  | 1.5646          |
| 1.6433        | 1.8352 | 9000  | 1.5554          |
| 1.6317        | 1.9372 | 9500  | 1.5552          |
| 1.6141        | 2.0392 | 10000 | 1.5404          |
| 1.6328        | 2.1411 | 10500 | 1.5393          |
| 1.5981        | 2.2431 | 11000 | 1.5330          |
| 1.6192        | 2.3450 | 11500 | 1.5260          |
| 1.6051        | 2.4470 | 12000 | 1.5198          |
| 1.6218        | 2.5489 | 12500 | 1.5162          |
| 1.5721        | 2.6509 | 13000 | 1.5079          |
| 1.5656        | 2.7529 | 13500 | 1.5109          |
| 1.5642        | 2.8548 | 14000 | 1.5077          |
| 1.5715        | 2.9568 | 14500 | 1.5106          |

Framework versions

  • Transformers 4.52.4
  • Pytorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.1