	# cardiffnlp/twitter-xlm-roberta-base-hate-spanish

	This model is [cardiffnlp/twitter-xlm-roberta-base](https://huggingface.co/cardiffnlp/twitter-xlm-roberta-base) fine-tuned for hate speech detection in Spanish tweets on the [`HaterNet`](https://zenodo.org/record/2592149) dataset and the Spanish subset of
	[`SemEval-2019 Task 5`](https://aclanthology.org/S19-2007/).

	## Evaluation metrics

	* on the test split of `SemEval-2019 Task 5`

	- F1 (weighted): 0.7866
	- F1 (macro): 0.7935
	- Accuracy: 0.7937

	* on a custom test split of `HaterNet`

	- F1 (weighted): 0.7815
	- F1 (macro): 0.6981
	- Accuracy: 0.7933

	* on `HaterNet` and `SemEval-2019 Task 5`

	- F1 (weighted): 0.7908
	- F1 (macro): 0.7657
	- Accuracy: 0.7936
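
The three scores can diverge when the classes are imbalanced, which may explain the gap between macro and weighted F1 on `HaterNet`. This card does not say how the scores were computed, so the following is only a minimal sketch of the standard definitions in plain Python. The helper names, the toy labels, and the positive label name `HATE` are assumptions made for illustration; they are not taken from the evaluation code or data.

```python
from collections import Counter


def f1_for_label(y_true, y_pred, label):
    """F1 of a single class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def score_predictions(y_true, y_pred):
    """Return weighted F1, macro F1 and accuracy for two label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of gold examples per class
    per_class = {label: f1_for_label(y_true, y_pred, label) for label in labels}
    # Macro F1: every class counts equally, however rare it is.
    macro = sum(per_class.values()) / len(labels)
    # Weighted F1: each class counts in proportion to its support.
    weighted = sum(per_class[label] * support[label] for label in labels) / len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {"f1_weighted": weighted, "f1_macro": macro, "accuracy": accuracy}


# Toy, made-up labels: 8 negative and 2 positive examples, one positive missed.
y_true = ["NOT-HATE"] * 8 + ["HATE"] * 2
y_pred = ["NOT-HATE"] * 8 + ["NOT-HATE", "HATE"]
scores = score_predictions(y_true, y_pred)
print(scores)
```

On this toy input the macro F1 is lower than the weighted F1, because the single missed example hurts the rare class and macro averaging gives that class half of the total weight.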



	### Usage
	Install tweetnlp via pip.
	```shell
	pip install tweetnlp
	```
	Load the model in Python and classify a tweet.
	```python
	import tweetnlp

	# The model weights are downloaded from the Hugging Face Hub on first use.
	model = tweetnlp.Classifier("cardiffnlp/twitter-xlm-roberta-base-hate-spanish")
	result = model.predict('Ismael es egocentrico porque se vuelve loca si le dicen que tiene el pelo bonito😂😂😂😂 eso se define con otro objetivo #FirstDates251')
	print(result)
	# {'label': 'NOT-HATE'}
	```
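
To screen several tweets at once, the single-tweet call above can be wrapped in a small loop. This is a sketch, not part of tweetnlp: `flag_tweets` and `KeywordStub` are hypothetical names, and the stub only stands in for the real classifier so the example runs offline. It assumes `predict` returns a dict with a `label` key, as in the output shown above, and treats every label other than `NOT-HATE` as flagged, since this card does not show the name of the positive label.

```python
def flag_tweets(classifier, tweets):
    """Return (tweet, label) pairs for tweets not labelled NOT-HATE."""
    flagged = []
    for tweet in tweets:
        result = classifier.predict(tweet)  # assumed shape: {'label': ...}
        if result["label"] != "NOT-HATE":
            flagged.append((tweet, result["label"]))
    return flagged


class KeywordStub:
    """Hypothetical stand-in for the real model; it only matches a keyword."""

    def predict(self, text):
        # "HATE" is an assumed label name used for illustration only.
        return {"label": "HATE" if "odio" in text.lower() else "NOT-HATE"}


tweets = [
    "Qué bonito día en Madrid",
    "Texto de ejemplo con la palabra odio",
]
flagged = flag_tweets(KeywordStub(), tweets)
print(flagged)
```

With the real model, pass `tweetnlp.Classifier("cardiffnlp/twitter-xlm-roberta-base-hate-spanish")` in place of `KeywordStub()`.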



	### Datasets
	@inproceedings{basile-etal-2019-semeval,
	    title = "{S}em{E}val-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in {T}witter",
	    author = "Basile, Valerio and
	      Bosco, Cristina and
	      Fersini, Elisabetta and
	      Nozza, Debora and
	      Patti, Viviana and
	      Rangel Pardo, Francisco Manuel and
	      Rosso, Paolo and
	      Sanguinetti, Manuela",
	    booktitle = "Proceedings of the 13th International Workshop on Semantic Evaluation",
	    month = jun,
	    year = "2019",
	    address = "Minneapolis, Minnesota, USA",
	    publisher = "Association for Computational Linguistics",
	    url = "https://aclanthology.org/S19-2007",
	    doi = "10.18653/v1/S19-2007",
	    pages = "54--63",
	}

	@article{quijano2019haternet,
	    title = {HaterNet a system for detecting and analyzing hate speech in Twitter (Version 1.0)[Data set]},
	    author = {Quijano-Sanchez, Lara and Kohatsu, Juan Carlos Pereira and Liberatore, Federico and Camacho-Collados, Miguel},
	    journal = {Zenodo},
	    year = {2019}
	}