HateBERT Fine-Tuned on Jigsaw Toxic Comments (v5)

This model is a fine-tuned version of GroNLP/hateBERT on a binary version of the Jigsaw Toxic Comment Classification Challenge dataset.

It has been fine-tuned to classify whether a comment is toxic (1) or non-toxic (0), using class-weighted Focal Loss during training and evaluation metrics suited to imbalanced classification (F1, confusion matrix, precision-recall curves).
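As a rough sketch of what a class-weighted Focal Loss can look like in PyTorch (the exact implementation, class weights, and gamma are not specified in this card, so the values below are illustrative assumptions only):

import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFocalLoss(nn.Module):
    # Class-weighted Focal Loss for a 2-logit (binary) classifier
    def __init__(self, class_weights, gamma=2.0):
        super().__init__()
        self.register_buffer("class_weights", torch.as_tensor(class_weights, dtype=torch.float))
        self.gamma = gamma

    def forward(self, logits, targets):
        # Per-example cross-entropy, weighted per class to counter the label imbalance
        ce = F.cross_entropy(logits, targets, weight=self.class_weights, reduction="none")
        # p_t = probability the model assigns to the true class
        pt = torch.exp(-F.cross_entropy(logits, targets, reduction="none"))
        # Down-weight easy examples by (1 - p_t)^gamma, then average over the batch
        return ((1.0 - pt) ** self.gamma * ce).mean()

logits = torch.randn(4, 2)            # dummy batch of 4 comments, 2 classes
labels = torch.tensor([0, 1, 0, 1])
print(WeightedFocalLoss([0.5, 5.0])(logits, labels))  # example class weights only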

πŸ’» Training Setup

  • Base Model: GroNLP/hateBERT
  • Dataset: Jigsaw Toxic Comment Classification Challenge
  • Binary Labeling: a comment is labeled toxic (1) if any of the following fine-grained labels is 1: toxic, severe_toxic, obscene, threat, insult, identity_hate (see the sketch after this list)
  • Tokenizer Max Length: 256
  • Loss Function: Focal Loss with class weights
  • Hardware: NVIDIA H100 GPU (via SLURM on TU Berlin HPC)
  • Training Time: ~6 hours
  • Final F1 Score (Validation): 0.850
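
The binarization rule above can be implemented with pandas roughly as follows (column names follow the official Jigsaw training CSV; the file path is an assumption):

import pandas as pd

LABEL_COLS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

df = pd.read_csv("train.csv")  # Jigsaw training split; the path is an assumption
# A comment is toxic (1) if any of the six fine-grained labels is 1
df["label"] = (df[LABEL_COLS].max(axis=1) > 0).astype(int)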

πŸ“Š Evaluation Metrics

  • F1 Score: 0.850
  • Accuracy: ~0.84
  • Confusion Matrix & PR Curves: saved and visualized during training
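
Metrics of this kind can be computed from validation predictions with scikit-learn; a minimal sketch (the arrays below are dummy placeholders, not the model's actual outputs):

import numpy as np
from sklearn.metrics import f1_score, accuracy_score, confusion_matrix, precision_recall_curve

# Dummy values for illustration; in practice y_true / y_prob come from the validation set
y_true = np.array([0, 0, 1, 1, 0, 1])
y_prob = np.array([0.10, 0.40, 0.80, 0.60, 0.30, 0.90])  # predicted P(toxic)

y_pred = (y_prob >= 0.5).astype(int)
print("F1:", f1_score(y_true, y_pred))
print("Accuracy:", accuracy_score(y_true, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))
precision, recall, thresholds = precision_recall_curve(y_true, y_prob)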

πŸ§ͺ How to Use

from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForSequenceClassification.from_pretrained("Jensvollends/hatebert-finetuned_v5")
tokenizer = AutoTokenizer.from_pretrained("Jensvollends/hatebert-finetuned_v5")

# top_k=None returns scores for every class instead of only the top prediction
pipe = pipeline("text-classification", model=model, tokenizer=tokenizer, top_k=None)

text = "You are a kind person"
result = pipe(text)
print(result)
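
Note: with top_k=None the pipeline returns a score for every class, not just the top prediction. Unless the model config defines a custom id2label mapping, the classes appear under the default names LABEL_0 and LABEL_1, which correspond to non-toxic and toxic respectively per the labeling above; the label with the higher score is the predicted class.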