alger-ia/dziribert

amine committed on Dec 28, 2022

Commit e6b27c5 • 1 parent: dbce128

docs(README): add limitations

Files changed (1)

README.md +4 -0

@@ -44,6 +44,10 @@ model = BertForMaskedLM.from_pretrained("alger-ia/dziribert")
 You can find a fine-tuning script in our Github repo: https://github.com/alger-ia/dziribert
+## Limitations
+The pre-training data used in this project comes from social media (Twitter). As a result, the masked language modeling objective may predict offensive words in some situations. Modeling such words can be an advantage (e.g. when training a hate speech detection model) or a disadvantage (e.g. when generated answers are sent directly to the end user). Depending on your downstream task, you may need to filter out such words, especially when returning automatically generated text to the end user.
 ### How to cite
 ```bibtex