HateBERT Fine-Tuned on Jigsaw Toxic Comments (v5)

This model is a fine-tuned version of GroNLP/hateBERT on a binary version of the Jigsaw Toxic Comment Classification Challenge dataset.

It has been fine-tuned to classify whether a comment is toxic (1) or non-toxic (0), using class-weighted Focal Loss during training and evaluation metrics suited to imbalanced classification (F1, confusion matrix, precision-recall curves).
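As a rough sketch of what a class-weighted Focal Loss can look like in PyTorch (the exact implementation, class weights, and gamma are not specified in this card, so the values below are illustrative assumptions only):

import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFocalLoss(nn.Module):
    # Class-weighted Focal Loss for a 2-logit (binary) classifier
    def __init__(self, class_weights, gamma=2.0):
        super().__init__()
        self.register_buffer("class_weights", torch.as_tensor(class_weights, dtype=torch.float))
        self.gamma = gamma

    def forward(self, logits, targets):
        # Per-example cross-entropy, weighted per class to counter the label imbalance
        ce = F.cross_entropy(logits, targets, weight=self.class_weights, reduction="none")
        # p_t = probability the model assigns to the true class
        pt = torch.exp(-F.cross_entropy(logits, targets, reduction="none"))
        # Down-weight easy examples by (1 - p_t)^gamma, then average over the batch
        return ((1.0 - pt) ** self.gamma * ce).mean()

logits = torch.randn(4, 2)            # dummy batch of 4 comments, 2 classes
labels = torch.tensor([0, 1, 0, 1])
print(WeightedFocalLoss([0.5, 5.0])(logits, labels))  # example class weights only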

πŸ’» Training Setup

  • Base Model: GroNLP/hateBERT
  • Dataset: Jigsaw Toxic Comment Classification Challenge
  • Binary Labeling: a comment is labeled toxic (1) if any of the following fine-grained labels is 1: toxic, severe_toxic, obscene, threat, insult, identity_hate (see the sketch after this list)
  • Tokenizer Max Length: 256
  • Loss Function: Focal Loss with class weights
  • Hardware: NVIDIA H100 GPU (via SLURM on TU Berlin HPC)
  • Training Time: ~6 hours
  • Final F1 Score (Validation): 0.850
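
The binarization rule above can be implemented with pandas roughly as follows (column names follow the official Jigsaw training CSV; the file path is an assumption):

import pandas as pd

LABEL_COLS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

df = pd.read_csv("train.csv")  # Jigsaw training split; the path is an assumption
# A comment is toxic (1) if any of the six fine-grained labels is 1
df["label"] = (df[LABEL_COLS].max(axis=1) > 0).astype(int)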

πŸ“Š Evaluation Metrics

  • F1 Score: 0.850
  • Accuracy: ~0.84
  • Confusion Matrix & PR Curves: saved and visualized during training
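
Metrics of this kind can be computed from validation predictions with scikit-learn; a minimal sketch (the arrays below are dummy placeholders, not the model's actual outputs):

import numpy as np
from sklearn.metrics import f1_score, accuracy_score, confusion_matrix, precision_recall_curve

# Dummy values for illustration; in practice y_true / y_prob come from the validation set
y_true = np.array([0, 0, 1, 1, 0, 1])
y_prob = np.array([0.10, 0.40, 0.80, 0.60, 0.30, 0.90])  # predicted P(toxic)

y_pred = (y_prob >= 0.5).astype(int)
print("F1:", f1_score(y_true, y_pred))
print("Accuracy:", accuracy_score(y_true, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))
precision, recall, thresholds = precision_recall_curve(y_true, y_prob)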

πŸ§ͺ How to Use

from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForSequenceClassification.from_pretrained("Jensvollends/hatebert-finetuned_v5")
tokenizer = AutoTokenizer.from_pretrained("Jensvollends/hatebert-finetuned_v5")

# top_k=None returns scores for every class instead of only the top prediction
pipe = pipeline("text-classification", model=model, tokenizer=tokenizer, top_k=None)

text = "You are a kind person"
result = pipe(text)
print(result)
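
Note: with top_k=None the pipeline returns a score for every class, not just the top prediction. Unless the model config defines a custom id2label mapping, the classes appear under the default names LABEL_0 and LABEL_1, which correspond to non-toxic and toxic respectively per the labeling above; the label with the higher score is the predicted class.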