---
language:
- en
library_name: pysentimiento
tags:
- twitter
- hate-speech
---

# Hate Speech detection in English

## bertweet-hate-speech

Repository: [https://github.com/pysentimiento/pysentimiento/](https://github.com/pysentimiento/pysentimiento/)

This model was trained on the SemEval 2019 Task 5 (HatEval) Subtask B corpus for hate speech detection in English. The base model is [BERTweet](https://huggingface.co/vinai/bertweet-base), a RoBERTa model trained on English tweets.

It is a multi-label classifier with the following classes:

- **HS**: is it hate speech?
- **TR**: is it targeted at a specific individual?
- **AG**: is it aggressive?

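As a rough illustration of how output over the three classes above can be read when each class is scored separately, the sketch below turns one raw score per class into a probability and keeps every class that clears a threshold. It rests on assumptions that do not come from this model card: the sigmoid decoding, the 0.5 threshold, the label order, and the example scores are all hypothetical, and the function names are made up for this example rather than taken from `pysentimiento`.

```python
import math

# Class names from the list above. Their order in the model's output
# is an assumption made for this sketch.
LABELS = ("HS", "TR", "AG")


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability, avoiding overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multilabel(logits, threshold=0.5):
    """Return per-label probabilities and the labels at or above the threshold.

    Each label is scored on its own, so any combination of labels
    (including none or all of them) can be active at once.
    """
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    probas = {label: sigmoid(z) for label, z in zip(LABELS, logits)}
    active = [label for label, p in probas.items() if p >= threshold]
    return probas, active


# Hypothetical logits, not produced by the real model.
probas, active = decode_multilabel([2.0, -1.0, 0.5])
print(active)  # ['HS', 'AG']
```

Scoring each class on its own, instead of applying a softmax across classes, is what would let a single tweet be flagged as hateful and aggressive at the same time.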
## License

`pysentimiento` is an open-source library for non-commercial use and scientific research purposes only. Please be aware that models are trained with third-party datasets and are subject to their respective licenses.

1. [TASS Dataset license](http://tass.sepln.org/tass_data/download.php)
2. [SemEval 2017 Dataset license]()

## Citation

If you use this model in your work, please cite the following papers:

```
@misc{perez2021pysentimiento,
  title={pysentimiento: A Python Toolkit for Sentiment Analysis and SocialNLP tasks},
  author={Juan Manuel Pérez and Juan Carlos Giudici and Franco Luque},
  year={2021},
  eprint={2106.09462},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

@inproceedings{nguyen2020bertweet,
  title={BERTweet: A pre-trained language model for English Tweets},
  author={Nguyen, Dat Quoc and Vu, Thanh and Nguyen, Anh Tuan},
  booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
  pages={9--14},
  year={2020}
}

@inproceedings{basile2019semeval,
  title={Semeval-2019 task 5: Multilingual detection of hate speech against immigrants and women in twitter},
  author={Basile, Valerio and Bosco, Cristina and Fersini, Elisabetta and Nozza, Debora and Patti, Viviana and Pardo, Francisco Manuel Rangel and Rosso, Paolo and Sanguinetti, Manuela},
  booktitle={Proceedings of the 13th international workshop on semantic evaluation},
  pages={54--63},
  year={2019}
}
```

Enjoy! 🤗