---
library_name: transformers
license: apache-2.0
base_model: bert-base-uncased
tags:
  - generated_from_trainer
model-index:
  - name: bert-philosophy-adapted
    results: []
datasets:
  - AiresPucrs/stanford-encyclopedia-philosophy
language:
  - en
pipeline_tag: fill-mask
---

# bert-philosophy-adapted

This model is a fine-tuned version of bert-base-uncased on the Stanford Encyclopedia of Philosophy dataset (AiresPucrs/stanford-encyclopedia-philosophy), trained with a masked language modeling objective. It achieves the following results on the evaluation set:

- Loss: 1.5044
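
The adapted encoder can be tried directly with the fill-mask pipeline. A minimal sketch, assuming the checkpoint is published under the repo id `maximuspowers/bert-philosophy-adapted`:

```python
from transformers import pipeline

# Repo id is an assumption based on this card's name; point it at wherever the checkpoint lives.
fill_mask = pipeline("fill-mask", model="maximuspowers/bert-philosophy-adapted")

for prediction in fill_mask("The categorical [MASK] is a central concept in Kantian ethics."):
    print(f"{prediction['token_str']:>15}  {prediction['score']:.4f}")
```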

## Model description

This model was trained to adapt a BERT encoder to philosophical terminology, as a base for further fine-tuning on downstream tasks such as classifying texts by school of philosophy.
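
As a sketch of that intended downstream use (the repo id and label set below are hypothetical), the checkpoint can be loaded with a freshly initialized classification head and then fine-tuned on labeled texts:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical repo id and label set, shown only to illustrate the intended downstream setup.
model_id = "maximuspowers/bert-philosophy-adapted"
labels = ["empiricism", "rationalism", "phenomenology", "pragmatism"]

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)
# The classification head is randomly initialized; it still needs supervised fine-tuning.
```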

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a Trainer sketch using them follows the list):

- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 3
- mixed_precision_training: Native AMP
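
A minimal sketch of how these hyperparameters plug into a masked-language-modeling run with `Trainer` is below. The text column name, the dataset splits, and the 15% masking probability (the library default) are assumptions rather than values stated in this card:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

dataset = load_dataset("AiresPucrs/stanford-encyclopedia-philosophy")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Assumes the raw text lives in a "text" column; truncates to BERT's 512-token limit.
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset["train"].column_names)
tokenized = tokenized["train"].train_test_split(test_size=0.1, seed=42)  # assumed eval split

# Dynamic masking with the default 15% masking probability.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

training_args = TrainingArguments(
    output_dir="bert-philosophy-adapted",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,   # effective train batch size of 32
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=3,
    seed=42,
    fp16=True,                       # native AMP mixed precision
    eval_strategy="steps",
    eval_steps=500,                  # matches the evaluation cadence in the results table
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=collator,
)
trainer.train()
```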

### Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 2.0568        | 0.1020 | 500   | 1.8821          |
| 1.9169        | 0.2039 | 1000  | 1.7939          |
| 1.873         | 0.3059 | 1500  | 1.7593          |
| 1.8408        | 0.4078 | 2000  | 1.7280          |
| 1.8461        | 0.5098 | 2500  | 1.7069          |
| 1.8108        | 0.6117 | 3000  | 1.6899          |
| 1.7959        | 0.7137 | 3500  | 1.6748          |
| 1.7771        | 0.8157 | 4000  | 1.6490          |
| 1.7705        | 0.9176 | 4500  | 1.6371          |
| 1.725         | 1.0196 | 5000  | 1.6317          |
| 1.707         | 1.1215 | 5500  | 1.6279          |
| 1.7127        | 1.2235 | 6000  | 1.6100          |
| 1.6806        | 1.3254 | 6500  | 1.5978          |
| 1.6809        | 1.4274 | 7000  | 1.5920          |
| 1.6766        | 1.5294 | 7500  | 1.5831          |
| 1.6598        | 1.6313 | 8000  | 1.5748          |
| 1.6632        | 1.7333 | 8500  | 1.5646          |
| 1.6433        | 1.8352 | 9000  | 1.5554          |
| 1.6317        | 1.9372 | 9500  | 1.5552          |
| 1.6141        | 2.0392 | 10000 | 1.5404          |
| 1.6328        | 2.1411 | 10500 | 1.5393          |
| 1.5981        | 2.2431 | 11000 | 1.5330          |
| 1.6192        | 2.3450 | 11500 | 1.5260          |
| 1.6051        | 2.4470 | 12000 | 1.5198          |
| 1.6218        | 2.5489 | 12500 | 1.5162          |
| 1.5721        | 2.6509 | 13000 | 1.5079          |
| 1.5656        | 2.7529 | 13500 | 1.5109          |
| 1.5642        | 2.8548 | 14000 | 1.5077          |
| 1.5715        | 2.9568 | 14500 | 1.5106          |
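
Since masked language modeling minimizes a cross-entropy loss, the final validation loss can be read as a rough (pseudo-)perplexity of exp(1.5044) ≈ 4.5:

```python
import math

eval_loss = 1.5044  # final validation loss from the table above
print(f"perplexity ≈ {math.exp(eval_loss):.2f}")  # ≈ 4.50
```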

### Framework versions

- Transformers 4.52.4
- Pytorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.1