sana-ngu committed on
Commit 71a111d · 1 Parent(s): 1d8538e

Update README.md

Files changed (1)
  1. README.md +13 -3
README.md CHANGED
@@ -1,16 +1,26 @@
1
  ### HaT5(T5-base)
2
  This is a fine-tuned model of T5 (base) on the hate speech detection dataset. It is intended to be used as a classification model for identifying Tweets (0 - HOF(hate/offensive); 1 - NOT). The task prefix we used for the T5 model is 'classification: '.
3
 
4
- The dataset it's trained on is limited in scope, as it covers only some news texts covering about 20 English-speaking countries. The macro F1 score achieved on the test set, based on the official evaluation, is 0.5452. More information about the original pre-trained model can be found here
5
 
6
  Classification examples:
7
 
8
  from transformers import T5ForConditionalGeneration, T5Tokenizer
9
  import torch
10
  model = T5ForConditionalGeneration.from_pretrained("sana-ngu/HaT5")
11
- tokenizer = T5Tokenizer.from_pretrained("t5-base") # use the source tokenizer because T5 finetuned tokenizer breaks
12
  tokenizer.pad_token = tokenizer.eos_token
13
  input_ids = tokenizer("Old lions in the wild lay down and die with dignity when they can't hunt anymore. If a government is having 'teething problems' handling aid supplies one full year into a pandemic, maybe it should take a cue and get the fuck out of the way? ", padding=True, truncation=True, return_tensors='pt').input_ids
14
  outputs = model.generate(input_ids)
15
  pred = tokenizer.decode(outputs[0], skip_special_tokens=True)
16
- print(pred)
 
 
 
1
  ### HaT5(T5-base)
2
  This is a fine-tuned model of T5 (base) on the hate speech detection dataset. It is intended to be used as a classification model for identifying Tweets (0 - HOF(hate/offensive); 1 - NOT). The task prefix we used for the T5 model is 'classification: '.
3
 
4
+ More information about the original pre-trained model can be found [here](https://huggingface.co/t5-base)
5
 
6
  Classification examples:
7
 
8
+ |Prediction|Tweet|
9
+ |-----|--------|
10
+ |0 |Why the fuck I got over 1000 views on my story 😂😂 nothing new over here |
11
+ |1. |first of all there is no vaccine to cure , whthr it is capsules, tablets or injections, they just support to fight with d virus. I do not support people taking any kind of home remedies n making fun of an ayurvedic medicine.. |
12
+
13
+ # How to use
14
+ ```python
15
+
16
  from transformers import T5ForConditionalGeneration, T5Tokenizer
17
  import torch
18
  model = T5ForConditionalGeneration.from_pretrained("sana-ngu/HaT5")
19
+ tokenizer = T5Tokenizer.from_pretrained("t5-base")
20
  tokenizer.pad_token = tokenizer.eos_token
21
  input_ids = tokenizer("Old lions in the wild lay down and die with dignity when they can't hunt anymore. If a government is having 'teething problems' handling aid supplies one full year into a pandemic, maybe it should take a cue and get the fuck out of the way? ", padding=True, truncation=True, return_tensors='pt').input_ids
22
  outputs = model.generate(input_ids)
23
  pred = tokenizer.decode(outputs[0], skip_special_tokens=True)
24
+ print(pred)
25
+
26
+ ```
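
The README states a task prefix (`'classification: '`) and a label scheme (0 - HOF, 1 - NOT), but the committed snippet passes the raw tweet and prints the raw prediction. A minimal sketch of how both could be handled is below. The helper names `build_input` and `label_from_prediction` are hypothetical and not part of the HaT5 repository, and the sketch assumes the decoded model output is the string `"0"` or `"1"`, which the source does not confirm.

```python
# Hypothetical helpers for illustration; not part of the HaT5 repository.

# Task prefix as stated in the README.
TASK_PREFIX = "classification: "

# Label scheme as stated in the README: 0 - HOF (hate/offensive); 1 - NOT.
LABELS = {"0": "HOF", "1": "NOT"}


def build_input(tweet: str, prefix: str = TASK_PREFIX) -> str:
    """Prepend the task prefix unless the text already starts with it."""
    tweet = tweet.strip()
    return tweet if tweet.startswith(prefix) else prefix + tweet


def label_from_prediction(pred: str) -> str:
    """Map decoded model output to a label name.

    Assumption: the decoded output is "0" or "1", possibly with
    surrounding whitespace or a trailing period. Anything else is
    reported as "UNKNOWN" rather than guessed.
    """
    key = pred.strip().rstrip(".")
    return LABELS.get(key, "UNKNOWN")
```

With the snippet from the README, this would mean calling `tokenizer(build_input(text), ...)` and `label_from_prediction(pred)`. Whether the prefix is required at inference time is not stated in the source, so it may be worth comparing predictions with and without it.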