youth-sentiment-classifier model

This model is a fine-tuned version of allegro/herbert-base-cased on the jziebura/polish_youth_slang_classification dataset.

It achieves the following results on the evaluation set:

  • Loss: 0.7289
  • Accuracy: 0.7127
  • F1 (weighted): 0.7110
  • F1 (macro): 0.6977
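The two F1 variants reported above differ in how per-class scores are averaged: weighted F1 weights each class by its support, while macro F1 averages the three classes equally. A minimal scikit-learn sketch (the labels below are illustrative, not from the evaluation set):

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy predictions over the three classes (0 = negative,
# 1 = neutral/ambiguous, 2 = positive); values are made up.
y_true = [0, 1, 2, 2, 1, 0, 2]
y_pred = [0, 1, 2, 1, 1, 0, 2]

print("Accuracy:     ", accuracy_score(y_true, y_pred))
# Weighted F1 weights per-class F1 by class support;
# macro F1 averages the per-class scores equally.
print("F1 (weighted):", f1_score(y_true, y_pred, average="weighted"))
print("F1 (macro):   ", f1_score(y_true, y_pred, average="macro"))
```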

Model description

The model is part of the experiments conducted for my master's thesis, "A language model analyzing Polish youth slang".

It was fine-tuned to classify the sentiment of Polish youth slang into three categories: negative, neutral/ambiguous, and positive.
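The model can be loaded from the Hub with the standard text-classification pipeline. A minimal sketch, assuming the repo id jziebura/youth-slang-sentiment-classifier and a guessed label order; check the model's config.json (id2label) for the authoritative mapping:

```python
from transformers import pipeline

# Assumed mapping from class ids to the three sentiment categories;
# verify against id2label in the model's config.json.
ID2LABEL = {0: "negative", 1: "neutral/ambiguous", 2: "positive"}

def classify(texts):
    # Downloads the model from the Hub on first use.
    clf = pipeline("text-classification",
                   model="jziebura/youth-slang-sentiment-classifier")
    return clf(texts)

# Example call (requires network access):
# classify(["Ale sztos, ten kawałek totalnie wymiata!"])
```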

Training and evaluation data

All data comes from the jziebura/polish_youth_slang_classification dataset.

Training procedure

The hyperparameters were initialized from the values recommended in the original BERT paper and then optimized using the Optuna backend.

The HPO and fine-tuning were both conducted on the Google Colab platform on their free-tier T4 GPU instances.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3.93e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 3

Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy | F1 (weighted) | F1 (macro) |
|---------------|--------|------|-----------------|----------|---------------|------------|
| No log        | 0      | 0    | 1.1025          | 0.3118   | 0.2848        | 0.2324     |
| 1.0546        | 0.1176 | 32   | 1.0001          | 0.5037   | 0.3374        | 0.2233     |
| 0.9406        | 0.2353 | 64   | 0.8969          | 0.5849   | 0.5654        | 0.5371     |
| 0.8885        | 0.3529 | 96   | 0.8430          | 0.6015   | 0.6074        | 0.6090     |
| 0.8452        | 0.4706 | 128  | 0.8230          | 0.6218   | 0.6173        | 0.5990     |
| 0.8208        | 0.5882 | 160  | 0.8393          | 0.6107   | 0.6125        | 0.5982     |
| 0.7182        | 0.7059 | 192  | 0.7848          | 0.6605   | 0.6504        | 0.6324     |
| 0.7644        | 0.8235 | 224  | 0.7708          | 0.6587   | 0.6516        | 0.6347     |
| 0.7211        | 0.9412 | 256  | 0.7734          | 0.6642   | 0.6440        | 0.6155     |
| 0.7182        | 1.0588 | 288  | 0.7423          | 0.6863   | 0.6761        | 0.6534     |
| 0.578         | 1.1765 | 320  | 0.7521          | 0.6661   | 0.6637        | 0.6503     |
| 0.6434        | 1.2941 | 352  | 0.7673          | 0.6771   | 0.6570        | 0.6373     |
| 0.5519        | 1.4118 | 384  | 0.8297          | 0.6513   | 0.6560        | 0.6548     |
| 0.5714        | 1.5294 | 416  | 0.7851          | 0.6531   | 0.6556        | 0.6472     |
| 0.583         | 1.6471 | 448  | 0.7941          | 0.6587   | 0.6585        | 0.6472     |
| 0.6426        | 1.7647 | 480  | 0.7596          | 0.6605   | 0.6623        | 0.6575     |
| 0.5681        | 1.8824 | 512  | 0.7831          | 0.6679   | 0.6672        | 0.6567     |
| 0.5424        | 2.0    | 544  | 0.7885          | 0.6439   | 0.6470        | 0.6472     |
| 0.4013        | 2.1176 | 576  | 0.8117          | 0.6771   | 0.6780        | 0.6696     |
| 0.369         | 2.2353 | 608  | 0.8527          | 0.6845   | 0.6856        | 0.6792     |
| 0.3405        | 2.3529 | 640  | 0.8640          | 0.6697   | 0.6652        | 0.6553     |
| 0.3633        | 2.4706 | 672  | 0.8678          | 0.6753   | 0.6682        | 0.6545     |
| 0.3827        | 2.5882 | 704  | 0.8551          | 0.6679   | 0.6649        | 0.6577     |
| 0.3826        | 2.7059 | 736  | 0.8680          | 0.6790   | 0.6770        | 0.6708     |
| 0.4146        | 2.8235 | 768  | 0.8515          | 0.6808   | 0.6801        | 0.6744     |
| 0.3592        | 2.9412 | 800  | 0.8463          | 0.6771   | 0.6762        | 0.6711     |

Framework versions

  • Transformers 4.54.0
  • Pytorch 2.6.0+cu124
  • Datasets 4.0.0
  • Tokenizers 0.21.2