amine committed on
Commit
e6b27c5
1 Parent(s): dbce128

docs(README): add limitations

Files changed (1)
  1. README.md +4 -0
README.md CHANGED
@@ -44,6 +44,10 @@ model = BertForMaskedLM.from_pretrained("alger-ia/dziribert")
 
  You can find a fine-tuning script in our Github repo: https://github.com/alger-ia/dziribert
 
+ ## Limitations
+
+ The pre-training data used in this project comes from social media (Twitter). As a result, the model trained with the Masked Language Modeling objective may predict offensive words in some situations. Modeling such words can be an advantage (e.g. when training a hate speech detection model) or a disadvantage (e.g. when generating answers that are sent directly to the end user). Depending on your downstream task, you may need to filter out such words, especially when returning automatically generated text to the end user.
+
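The filtering step mentioned in the added paragraph could look roughly like the sketch below. It is not part of this commit or of the DziriBERT repo. It assumes predictions in the list-of-dicts shape returned by the `transformers` fill-mask pipeline (each entry has a `token_str` and a `score`). The helper name `filter_predictions`, the blocklist, and the sample predictions are hypothetical placeholders.

```python
from typing import Dict, Iterable, List


def filter_predictions(predictions: List[Dict], blocklist: Iterable[str]) -> List[Dict]:
    """Drop candidates whose predicted token appears in the blocklist.

    `predictions` is assumed to follow the fill-mask pipeline output format:
    a list of dicts, each with at least a "token_str" key.
    """
    blocked = {word.strip().lower() for word in blocklist}
    return [
        p for p in predictions
        if p["token_str"].strip().lower() not in blocked
    ]


# Hypothetical stand-in for real pipeline output, e.g. from
# pipeline("fill-mask", model="alger-ia/dziribert")(text_with_mask)
example_predictions = [
    {"token_str": "good", "score": 0.60},
    {"token_str": "badword", "score": 0.25},
    {"token_str": "fine", "score": 0.15},
]

# Hypothetical blocklist; a real one would be curated for the target dialect.
safe_predictions = filter_predictions(example_predictions, {"badword"})
```

An exact-match blocklist is only a starting point. It will miss spelling variants and multi-token expressions, so user-facing applications may need a dedicated classifier as well.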
  ### How to cite
 
  ```bibtex