Text classification model for coherence evaluation in German scientific texts

gbert-large-coherence_evaluation is a German sequence classification model for the scientific domain, fine-tuned from gbert-large. It was trained on a custom annotated dataset of roughly 12,000 training and 3,000 test examples containing coherent and incoherent text sequences from German Wikipedia articles.

Compared to the base version, this model achieved a slightly higher peak accuracy (95.30%) on the validation set, reached at epoch 7. However, the base model reached its lowest evaluation loss (0.2347) earlier in training, suggesting that it converges faster but may generalize slightly worse. These findings can inform model selection depending on whether inference efficiency or accuracy is prioritized.

| Tag | Label | Description |
|-----|-------|-------------|
| 0 | INCOHERENT | The text lacks coherence and cohesion. |
| 1 | COHERENT | The text is coherent and cohesive. |
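At inference time, the classifier can be loaded with the Hugging Face `transformers` pipeline. A minimal sketch (the example sentence is illustrative; label names come from the model's config):

```python
from transformers import pipeline

# Load the fine-tuned coherence classifier from the Hugging Face Hub
classifier = pipeline(
    "text-classification",
    model="samirmsallem/gbert-large-coherence_evaluation",
)

# Classify a German text sequence; the model scores it as coherent or incoherent
result = classifier(
    "Die Ergebnisse zeigen einen klaren Trend. Dieser Trend wird im Folgenden erläutert."
)
print(result)
```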

Training

Training followed a 10-epoch fine-tuning schedule:

| Epoch | Eval Loss | Eval Accuracy |
|-------|-----------|---------------|
| 1 | 0.3411 | 0.9254 |
| 2 | 0.2759 | 0.9450 |
| 3 | 0.3085 | 0.9477 |
| 4 | 0.2997 | 0.9494 |
| 5 | 0.3702 | 0.9450 |
| 6 | 0.3373 | 0.9497 |
| 7 | 0.3259 | 0.9530 |
| 8 | 0.3867 | 0.9490 |
| 9 | 0.4123 | 0.9460 |
| 10 | 0.4452 | 0.9437 |
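The table shows a familiar overfitting pattern: evaluation loss bottoms out at epoch 2, while accuracy peaks at epoch 7 and then declines. Checkpoint selection from such a history can be sketched in a few lines (values copied from the table above):

```python
# (epoch, eval_loss, eval_accuracy) tuples from the training table above
HISTORY = [
    (1, 0.3411, 0.9254), (2, 0.2759, 0.9450), (3, 0.3085, 0.9477),
    (4, 0.2997, 0.9494), (5, 0.3702, 0.9450), (6, 0.3373, 0.9497),
    (7, 0.3259, 0.9530), (8, 0.3867, 0.9490), (9, 0.4123, 0.9460),
    (10, 0.4452, 0.9437),
]

# Pick the checkpoint with the highest accuracy vs. the lowest loss
best_by_acc = max(HISTORY, key=lambda row: row[2])
best_by_loss = min(HISTORY, key=lambda row: row[1])
print(best_by_acc[0], best_by_loss[0])  # epochs 7 and 2, respectively
```

Which criterion to use depends on the downstream goal; this card reports the accuracy-selected epoch-7 checkpoint.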

Training used a standard text classification objective. The model achieves roughly 95% accuracy on the evaluation set.

Final metrics on the test dataset for the best checkpoint (epoch 7):

- Accuracy: 0.9530
- Loss: 0.3259
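For reference, the accuracy reported above is simply the fraction of test examples whose predicted tag matches the gold tag; a toy sketch with hypothetical predictions:

```python
def accuracy(preds, labels):
    """Fraction of positions where prediction equals the gold label."""
    return sum(p == g for p, g in zip(preds, labels)) / len(labels)

# Hypothetical predictions vs. gold tags (1 = COHERENT, 0 = INCOHERENT)
score = accuracy([1, 1, 0, 1], [1, 0, 0, 1])
print(score)  # 0.75
```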
Model size: 336M parameters (F32, safetensors).
