Multilingual Hate Speech Classifier for Social Media with Disagreement-Aware Training
A multilingual XLM-R-based (100 languages) hate speech classification model fine-tuned on English, Italian and Slovenian with inter-annotator disagreement-aware training.
The details of the model and the disagreement-aware training are described in our paper:
@inproceedings{
grigor2024multilingual,
title={Multilingual Hate Speech Modeling by Leveraging Inter-Annotator Disagreement},
author={Grigor, Patricia-Carla and Evkoski, Bojan and Kralj Novak, Petra},
url={http://dx.doi.org/10.70314/is.2024.sikdd.7},
DOI={10.70314/is.2024.sikdd.7},
booktitle={Proceedings of Data Mining and Data Warehouses – Sikdd 2024},
publisher={Jožef Stefan Institute},
year={2024}
}
Authors: Patricia-Carla Grigor, Bojan Evkoski, Petra Kralj Novak
Data available here: English; Italian; Slovenian
Model output
The model classifies each input into one of four classes:
- 0 - appropriate
- 1 - inappropriate
- 2 - offensive
- 3 - violent
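A minimal sketch of how the numeric class ids above could be mapped back to their names. It assumes the pipeline reports labels as `LABEL_<id>` strings (the Transformers default when a config defines no custom label names); `ID2NAME`, `decode` and the example scores are hypothetical and only for illustration.

```python
# Class ids and names as listed in this model card.
ID2NAME = {0: "appropriate", 1: "inappropriate", 2: "offensive", 3: "violent"}


def decode(scores):
    """Return (class_name, score) for the highest-scoring entry.

    `scores` is a list of {"label": ..., "score": ...} dicts for one input.
    Assumption: labels look like "LABEL_2"; adjust the parsing if they differ.
    """
    best = max(scores, key=lambda s: s["score"])
    class_id = int(best["label"].rsplit("_", 1)[-1])
    return ID2NAME[class_id], best["score"]


# Made-up scores for a single input, not real model output.
example = [
    {"label": "LABEL_0", "score": 3.1},
    {"label": "LABEL_1", "score": -0.4},
    {"label": "LABEL_2", "score": -1.2},
    {"label": "LABEL_3", "score": -2.0},
]
print(decode(example))
```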
Training data*
- 51k English YouTube comments
- 60k Italian YouTube comments
- 50k Slovenian Twitter comments
Evaluation data*
- 10k English YouTube comments
- 10k Italian YouTube comments
- 10k Slovenian Twitter comments
* Each comment was manually labeled by two different annotators.
Fine-tuning hyperparameters
num_train_epochs=3,
train_batch_size=8,
learning_rate=6e-6
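The card does not say which training framework produced these values. As a rough sketch, assuming one wanted to reuse them with the Hugging Face `Trainer`, the only name that needs translating is `train_batch_size`, which is assumed here to mean the per-device batch size. `card_hparams`, `RENAMES` and `trainer_kwargs` are hypothetical names.

```python
# Values copied from this model card.
card_hparams = {"num_train_epochs": 3, "train_batch_size": 8, "learning_rate": 6e-6}

# Assumption: `train_batch_size` corresponds to Trainer's per-device batch size.
RENAMES = {"train_batch_size": "per_device_train_batch_size"}
trainer_kwargs = {RENAMES.get(k, k): v for k, v in card_hparams.items()}
print(trainer_kwargs)

# With Transformers installed, these keys could be passed on as
#   TrainingArguments(output_dir="out", **trainer_kwargs)
# where "out" is a placeholder directory, not a path from the card.
```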
Evaluation Results
Model-annotator agreement (accuracy) vs. inter-annotator agreement, in percent (0 - no agreement; 100 - perfect agreement):
| Language | Model-annotator Agreement | Inter-annotator Agreement |
|---|---|---|
| English | 79.97 | 82.91 |
| Italian | 82.00 | 81.79 |
| Slovenian | 78.84 | 79.43 |
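Agreement here is plain accuracy between two label sequences, scaled to 0-100. The sketch below shows one way such numbers could be computed. The labels are toy values, `agreement` is a hypothetical helper, and pooling the model's comparisons against both annotators is an assumption rather than a procedure confirmed by the card.

```python
def agreement(labels_a, labels_b):
    """Percentage of items on which two label sequences coincide (0-100)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    if not labels_a:
        raise ValueError("label sequences must not be empty")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * matches / len(labels_a)


# Toy labels (made up for illustration), using class ids 0-3.
annotator_1 = [0, 0, 2, 1, 3, 0]
annotator_2 = [0, 2, 2, 1, 0, 0]
model_preds = [0, 0, 2, 2, 3, 0]

inter_annotator = agreement(annotator_1, annotator_2)
# Assumption: the model is compared against both annotators and the
# comparisons are pooled into one score.
model_annotator = agreement(model_preds * 2, annotator_1 + annotator_2)
print(round(inter_annotator, 2), round(model_annotator, 2))
```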
Class-specific model F1-scores:
| Language | Appropriate | Inappropriate | Offensive | Violent |
|---|---|---|---|---|
| English | 86.10 | 39.16 | 68.24 | 27.82 |
| Italian | 89.77 | 58.45 | 60.42 | 44.97 |
| Slovenian | 84.30 | 45.22 | 69.69 | 24.79 |
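For reference, a per-class F1-score treats each class in turn as the positive class and combines precision and recall as F1 = 2·TP / (2·TP + FP + FN). The following is a small sketch of that computation on toy labels; `per_class_f1` is a hypothetical helper and the labels are made up, so the numbers it prints are unrelated to the table above.

```python
def per_class_f1(y_true, y_pred, num_classes=4):
    """Return a list of F1-scores (0-100), one per class id."""
    scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class that never occurs and is never predicted gets 0 by convention here.
        scores.append(100.0 * 2 * tp / denom if denom else 0.0)
    return scores


# Toy gold labels and predictions (class ids 0-3), for illustration only.
y_true = [0, 0, 2, 1, 3, 0]
y_pred = [0, 0, 2, 2, 3, 1]

f1 = per_class_f1(y_true, y_pred)
for name, score in zip(["appropriate", "inappropriate", "offensive", "violent"], f1):
    print(f"{name}: {score:.2f}")
```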
Usage
from transformers import AutoModelForSequenceClassification, TextClassificationPipeline, AutoTokenizer, AutoConfig

MODEL = "IMSyPP/hate_speech_multilingual"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
config = AutoConfig.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
# device=0 selects the first GPU (use device=-1 to run on CPU).
# function_to_apply="none" returns raw, unnormalized scores for every class.
pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer, return_all_scores=True,
                                  task='sentiment_analysis', device=0, function_to_apply="none")
pipe([
    "Thank you for using our model",
    "Grazie per aver utilizzato il nostro modello",
    "Hvala za uporabo našega modela"
])
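Because the pipeline is created with `function_to_apply="none"`, the scores it returns are raw logits rather than probabilities. If probabilities are wanted, a softmax can be applied afterwards. The sketch below uses only the standard library; the logits are made-up values and `softmax` is a hypothetical helper, not part of the model's API.

```python
import math


def softmax(logits):
    """Convert a list of raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for the four classes (ids 0-3) of a single input.
raw = [3.1, -0.4, -1.2, -2.0]
probs = softmax(raw)
predicted_class = probs.index(max(probs))
print(predicted_class, [round(p, 3) for p in probs])
```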
Base model: FacebookAI/xlm-roberta-large