ifmain committed on
Commit d02c985 · verified · 1 Parent(s): 32fa5b8

Update README.md

Files changed (1)
  1. README.md +112 -3
README.md CHANGED
@@ -1,3 +1,112 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ ---
+ # ModerationBERT-ML-En
+
+ **ModerationBERT-ML-En** is a text moderation model based on `bert-base-multilingual-cased`. It scores text against 18 moderation categories and currently works only with English text.
+
+ ## Dataset
+
+ The model was trained and fine-tuned on the [text-moderation-410K](https://huggingface.co/datasets/ifmain/text-moderation-410K) dataset, which contains text samples labeled with moderation categories.
+
+ ## Model Description
+
+ ModerationBERT-ML-En uses the BERT architecture to classify text into the following categories:
+ - harassment
+ - harassment_threatening
+ - hate
+ - hate_threatening
+ - self_harm
+ - self_harm_instructions
+ - self_harm_intent
+ - sexual
+ - sexual_minors
+ - violence
+ - violence_graphic
+ - self-harm
+ - sexual/minors
+ - hate/threatening
+ - violence/graphic
+ - self-harm/intent
+ - self-harm/instructions
+ - harassment/threatening
+
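The list above holds 18 labels, but seven of them (`self-harm`, `sexual/minors`, `hate/threatening`, `violence/graphic`, `self-harm/intent`, `self-harm/instructions`, `harassment/threatening`) are hyphen or slash spellings of labels that also appear in underscore form. The sketch below shows one possible way to collapse a score dictionary onto the 11 underscore names. The helper names `canonical` and `merge_scores`, the example scores, and the choice to keep the larger of two scores are assumptions for illustration; the README does not say how the paired labels should be combined.

```python
def canonical(label: str) -> str:
    """Map 'self-harm/intent' style labels to 'self_harm_intent' style."""
    return label.replace("-", "_").replace("/", "_")


def merge_scores(scores: dict[str, float]) -> dict[str, float]:
    """Collapse alias labels, keeping the highest score per canonical name."""
    merged: dict[str, float] = {}
    for label, score in scores.items():
        key = canonical(label)
        # Scores are probabilities, so 0.0 is a safe lower bound for the default.
        merged[key] = max(score, merged.get(key, 0.0))
    return merged


# Made-up scores, only to show the merge.
example = {"self_harm": 0.02, "self-harm": 0.10, "sexual/minors": 0.01, "hate": 0.30}
merged = merge_scores(example)
print(merged)
```

Taking the maximum is the conservative option for moderation, since a text is treated as risky if either spelling of the label scores high; averaging would be an equally plausible choice.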
+ ## Training and Fine-Tuning
+
+ The dataset was split into 95% for training and 5% for evaluation. Training was performed in two stages:
+
+ 1. **Initial Training**: The classifier layer was trained with the BERT layers frozen.
+ 2. **Fine-Tuning**: The top two layers of the BERT model were unfrozen and the model was fine-tuned further.
+
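The two stages above can be sketched as a rule that decides, per parameter name, whether the parameter receives gradients. This is a minimal illustration, not the author's training code: the helper `is_trainable` is hypothetical, the names follow the `bert.` / `classifier.` prefixes used by `BertForSequenceClassification`, and it assumes the 12 encoder layers of `bert-base-multilingual-cased`, with the embeddings and pooler left frozen in both stages.

```python
def is_trainable(param_name: str, stage: int, num_layers: int = 12, top_k: int = 2) -> bool:
    """Decide whether a parameter should receive gradients in a given stage.

    stage 1: only the classification head is trained.
    stage 2: the head plus the top `top_k` encoder layers are trained.
    """
    if param_name.startswith("classifier."):
        return True
    if stage == 2:
        top_layers = range(num_layers - top_k, num_layers)
        # The trailing dot keeps layer 1 from matching layers 10 and 11.
        return any(param_name.startswith(f"bert.encoder.layer.{i}.") for i in top_layers)
    return False


names = [
    "bert.embeddings.word_embeddings.weight",
    "bert.encoder.layer.1.output.dense.weight",
    "bert.encoder.layer.10.output.dense.weight",
    "bert.encoder.layer.11.output.dense.weight",
    "classifier.weight",
]
stage1 = [n for n in names if is_trainable(n, stage=1)]
stage2 = [n for n in names if is_trainable(n, stage=2)]
print(stage1)
print(stage2)
```

With a loaded PyTorch model, the rule could be applied by looping over `model.named_parameters()` and setting `param.requires_grad = is_trainable(name, stage)` before building the optimizer for each stage.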
+ ## Installation
+
+ To use ModerationBERT-ML-En, install the Hugging Face `transformers` library and `torch`:
+
+ ```bash
+ pip install transformers torch
+ ```
+
+ ## Usage
+
+ The following example uses ModerationBERT-ML-En to predict the moderation category scores for a given text:
+
+ ```python
+ import json
+ import torch
+ from transformers import BertTokenizer, BertForSequenceClassification
+
+ # Load the tokenizer and model (model_name is a local directory or Hub repo id)
+ model_name = "ModerationBERT-ML-En"
+ tokenizer = BertTokenizer.from_pretrained(model_name)
+ model = BertForSequenceClassification.from_pretrained(model_name, num_labels=18)
+
+ # Device configuration
+ device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+ model.to(device)
+
+ def predict(text, model, tokenizer):
+     # Tokenize, padding or truncating to a fixed length of 128 tokens
+     encoding = tokenizer(
+         text,
+         add_special_tokens=True,
+         max_length=128,
+         return_token_type_ids=False,
+         padding='max_length',
+         truncation=True,
+         return_attention_mask=True,
+         return_tensors='pt'
+     )
+     input_ids = encoding['input_ids'].to(device)
+     attention_mask = encoding['attention_mask'].to(device)
+     model.eval()
+     with torch.no_grad():
+         outputs = model(input_ids, attention_mask=attention_mask)
+     # Sigmoid gives an independent probability per category (multi-label)
+     predictions = torch.sigmoid(outputs.logits)
+     return predictions
+
+ # Example usage
+ new_text = "This isn't Twitter: try to comment on the article, and not your current activities."
+ predictions = predict(new_text, model, tokenizer)
+
+ # Define the categories, in the order of the model's output logits
+ categories = ['harassment', 'harassment_threatening', 'hate', 'hate_threatening',
+               'self_harm', 'self_harm_instructions', 'self_harm_intent', 'sexual',
+               'sexual_minors', 'violence', 'violence_graphic', 'self-harm',
+               'sexual/minors', 'hate/threatening', 'violence/graphic',
+               'self-harm/intent', 'self-harm/instructions', 'harassment/threatening']
+
+ # Convert predictions to a dictionary of category -> score
+ category_scores = {categories[i]: predictions[0][i].item() for i in range(len(categories))}
+
+ output = {
+     "text": new_text,
+     "category_scores": category_scores
+ }
+
+ # Print the result as JSON with indentation
+ print(json.dumps(output, indent=4, ensure_ascii=False))
+ ```
+
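The usage example prints raw scores and leaves the moderation decision to the caller. A common follow-up step is to turn the scores into a list of flagged categories, which could look roughly like this. The helper `flag_categories`, the default threshold of 0.5, and the sample scores are assumptions for illustration; the README does not recommend a threshold.

```python
def flag_categories(category_scores: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return the categories whose score meets the threshold, highest first."""
    flagged = [(name, score) for name, score in category_scores.items() if score >= threshold]
    flagged.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in flagged]


# Made-up scores standing in for the `category_scores` dictionary built above.
sample_scores = {"harassment": 0.81, "hate": 0.12, "violence": 0.55, "sexual": 0.03}
print(flag_categories(sample_scores))
print(flag_categories(sample_scores, threshold=0.9))
```

Because each category gets its own sigmoid probability, a text can be flagged for several categories at once or for none; in practice the threshold would likely be tuned per category on the evaluation split.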
+ ## Notes
+
+ - This model is currently configured to work only with English text.
+ - Future updates may include support for additional languages.