mmuratardag committed on
Commit eaeef66 · 1 Parent(s): f36501a

updated model & README

Files changed (4)
  1. README.md +256 -1
  2. config.json +1 -1
  3. model.safetensors +1 -1
  4. pytorch_model.bin +1 -1
README.md CHANGED
@@ -2,4 +2,259 @@
  license: gpl-3.0
  language:
  - en
- ---
+ ---
+ # Model Card for MoralFoundationsClassifier
+
+ This model is a fine-tuned RoBERTa-based classifier that predicts the presence of ten moral foundations (five virtues and five vices) in text. It can be used for text-analysis tasks such as content analysis and opinion mining.
+
+ ## Model Details
+
+ ### Model Description
+
+ - **Developed by:** M. Murat Ardag
+ - **Funded by:** not applicable
+ - **Shared by:** not applicable
+ - **Model type:** Multi-label classification
+ - **License:** GPL-3.0
+ - **Fine-tuned from model:** roberta-base
+
+ ### Model Sources
+
+ - **Repository:** https://huggingface.co/MMADS/MoralFoundationsClassifier
+ - **See the following sources & papers for alternatives:**
+   - https://moralfoundations.org/other-materials/
+   - Frimer, J. A., Boghrati, R., Haidt, J., Graham, J., & Dehghani, M. (2019). Moral Foundations Dictionary for Linguistic Analyses 2.0. Unpublished manuscript.
+   - Hopp, F. R., Fisher, J. T., Cornell, D., Huskey, R., & Weber, R. (2021). The extended Moral Foundations Dictionary (eMFD): Development and applications of a crowd-sourced approach to extracting moral intuitions from text. Behavior Research Methods, 53(1), 232–246. https://doi.org/10.3758/s13428-020-01433-0
+   - Kennedy, B., Atari, M., Mostafazadeh Davani, A., Hoover, J., Omrani, A., Graham, J., & Dehghani, M. (2021). Moral concerns are differentially observable in language. Cognition, 212, 104696. https://doi.org/10.1016/j.cognition.2021.104696
+
+
+ ## Uses
+
+ ### Direct Use
+
+ The model can be directly used for classifying text into the following moral foundations:
+
+ - **Care**: Caring for others and protecting them from harm.
+ - **Fairness**: Justice and treating others equally.
+ - **Loyalty**: Group loyalty, patriotism, and self-sacrifice for the group.
+ - **Authority**: Respect for tradition and legitimate authority.
+ - **Sanctity**: Purity and disgust; avoiding dangerous diseases and contaminants.
+
+ Each foundation is represented as a virtue (positive expression) and a vice (negative expression), giving ten labels in total; a sketch of the label names follows.
+
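+ A minimal sketch of the ten label names, assuming the `<foundation>_virtue` / `<foundation>_vice` naming used in the per-class metrics below; the authoritative list and ordering live in the repository's `label_names.json`:
+
+ ```python
+ # The ten output labels: one virtue and one vice per foundation.
+ # Ordering is an assumption; load label_names.json for the real order.
+ LABELS = [
+     "care_virtue", "care_vice",
+     "fairness_virtue", "fairness_vice",
+     "loyalty_virtue", "loyalty_vice",
+     "authority_virtue", "authority_vice",
+     "sanctity_virtue", "sanctity_vice",
+ ]
+ ```
+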
+ It's particularly useful for researchers, policymakers, and analysts interested in understanding moral reasoning and rhetoric in different contexts.
+
+ ### Downstream Use
+
+ Potential downstream uses include:
+
+ - **Content analysis**: Analyzing the moral framing of news articles, social media posts, or other types of text.
+ - **Opinion mining**: Understanding the moral values underlying people's opinions and arguments.
+ - **Ethical assessment**: Evaluating the ethical implications of decisions, policies, or products.
+
+ ### Out-of-Scope Use
+
+ - This model is not designed to predict specific actions or behaviors from moral foundations.
+ - It may not generalize accurately to texts that differ significantly in style or domain from its training data **(not recommended for social media data)**.
+ - The model should not be used to make definitive judgments about the morality of individuals or groups.
+
+ ## Bias, Risks, and Limitations
+
+ - The model was trained on a diverse corpus but may still reflect biases present in the data.
+ - It may be sensitive to specific wording or phrasing, leading to inconsistent predictions.
+ - The interpretation of moral foundations can be subjective, and the model's predictions may not always align with human judgment.
+
+ ### Recommendations
+
+ - Be aware of the potential biases and limitations of the model.
+ - Carefully consider the context and purpose of the text being analyzed.
+ - Use the model's predictions as a starting point for further analysis and discussion.
+
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ ```python
+ import json
+
+ import torch
+ from huggingface_hub import hf_hub_download
+ from transformers import RobertaTokenizer, RobertaForSequenceClassification
+
+ # Load the model and tokenizer from the Hub
+ model_path = "MMADS/MoralFoundationsClassifier"
+ model = RobertaForSequenceClassification.from_pretrained(model_path)
+ tokenizer = RobertaTokenizer.from_pretrained(model_path)
+
+ # Load label names (download the file from the repo first; a plain
+ # open() on the repo ID would only work with a local checkout)
+ label_file = hf_hub_download(repo_id=model_path, filename="label_names.json")
+ with open(label_file, "r") as f:
+     label_names = json.load(f)
+
+ # Example prediction helper: one sigmoid per label, 0.5 threshold
+ def predict(text, threshold=0.5):
+     inputs = tokenizer(text, return_tensors="pt", truncation=True)
+     with torch.no_grad():
+         logits = model(**inputs).logits
+     probs = torch.sigmoid(logits).squeeze(0)
+     return {label: round(p.item(), 3)
+             for label, p in zip(label_names, probs) if p >= threshold}
+
+ print(predict("We must protect the vulnerable from harm."))
+ ```
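+
+ Because this is a multi-label classifier, the example applies an independent sigmoid to each label rather than a softmax across labels: several foundations can appear in the same text. The 0.5 threshold is a default and can be tuned per label.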
+
+ ## Training Details
+
+ ### Training Data
+
+ The model was trained on a diverse corpus of text, including:
+
+ - Personal blogs
+ - Political blogs
+ - News media articles
+ - Essays
+ - Short stories
+ - Parliamentary debates from the US, UK, CA, and NZ
+ - Speeches at the UN
+ - Speeches by central bank presidents
+
+ The training data is a subset of >60M sentences.
+
+ ### Training Procedure
+
+ The model was fine-tuned using the HuggingFace Transformers library with the following hyperparameters (a sketch of how they map onto the Trainer API follows the list):
+
+ - num_train_epochs: 10
+ - per_device_train_batch_size: 8
+ - per_device_eval_batch_size: 8
+ - learning rate: 3e-5
+ - optimizer: AdamW
+ - loss function: binary cross-entropy with logits (BCEWithLogitsLoss)
+
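+ The training script itself is not published; the sketch below shows one way these hyperparameters would map onto the Trainer API, using `problem_type="multi_label_classification"` so that `BCEWithLogitsLoss` is applied. Dataset variables are placeholders.
+
+ ```python
+ from transformers import (RobertaForSequenceClassification, Trainer,
+                           TrainingArguments)
+
+ # 10 labels (5 virtues + 5 vices); this problem_type makes the model
+ # use BCEWithLogitsLoss, matching the reported loss function.
+ model = RobertaForSequenceClassification.from_pretrained(
+     "roberta-base",
+     num_labels=10,
+     problem_type="multi_label_classification",
+ )
+
+ args = TrainingArguments(
+     output_dir="moral-foundations-classifier",  # placeholder path
+     num_train_epochs=10,
+     per_device_train_batch_size=8,
+     per_device_eval_batch_size=8,
+     learning_rate=3e-5,  # AdamW is the Trainer's default optimizer
+ )
+
+ # Supply tokenized train/eval datasets with float multi-hot labels here.
+ trainer = Trainer(model=model, args=args)
+ # trainer.train()
+ ```
+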
+ ## Evaluation
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ The model was evaluated on a held-out portion of the training data.
+
+ #### Metrics
+
+ - **Overall Accuracy:** 0.9957
+ - **Overall Precision:** 0.9957
+ - **Overall Recall:** 0.9957
+ - **Overall F1-score:** 0.9957
+
+ See the section below for detailed per-class metrics.
+
+ ### Results
+
+ The model achieves high overall performance, with variations across the moral foundations. The *loyalty_vice* category has a notably lower F1-score due to low recall, indicating difficulty in identifying this specific vice.
+
+ **Per-class metrics:**
+
+ | Label            | Accuracy | Precision | Recall | F1     |
+ |------------------|----------|-----------|--------|--------|
+ | care_virtue      | 0.9954   | 0.9779    | 0.9758 | 0.9769 |
+ | care_vice        | 0.9960   | 0.9734    | 0.9506 | 0.9619 |
+ | fairness_virtue  | 0.9974   | 0.9786    | 0.9645 | 0.9715 |
+ | fairness_vice    | 0.9970   | 0.9319    | 0.8574 | 0.8931 |
+ | loyalty_virtue   | 0.9945   | 0.9811    | 0.9780 | 0.9795 |
+ | loyalty_vice     | 0.9972   | 1.0000    | 0.0531 | 0.1008 |
+ | authority_virtue | 0.9914   | 0.9621    | 0.9683 | 0.9652 |
+ | authority_vice   | 0.9963   | 0.9848    | 0.5838 | 0.7331 |
+ | sanctity_virtue  | 0.9963   | 0.9640    | 0.9458 | 0.9548 |
+ | sanctity_vice    | 0.9958   | 0.9538    | 0.8530 | 0.9006 |
+
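+ The exact evaluation script is not part of this repository. As a minimal sketch, per-label metrics of this kind can be computed with scikit-learn (an assumed dependency, not one stated by the card) from thresholded predictions:
+
+ ```python
+ import numpy as np
+ from sklearn.metrics import precision_recall_fscore_support
+
+ # Toy stand-ins: rows are texts, columns are labels (three shown for
+ # brevity; the real matrices would have ten columns, one per label).
+ y_true = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 1]])
+ y_pred = np.array([[1, 0, 0], [0, 0, 0], [1, 0, 1]])
+
+ # average=None returns one precision/recall/F1 per label column
+ precision, recall, f1, _ = precision_recall_fscore_support(
+     y_true, y_pred, average=None, zero_division=0)
+ accuracy = (y_true == y_pred).mean(axis=0)  # per-label accuracy
+ ```
+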
+ ## Environmental Impact
+
+ **Minimal.** Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** Google Colab GPU
+ - **Hours used:** 8
+ - **Cloud Provider:** Google
+ - **Compute Region:** South Carolina
+
+ ## Technical Specifications
+
+ ### Model Architecture and Objective
+
+ The model is based on the RoBERTa architecture, fine-tuned for multi-label classification. It uses PyTorch's `BCEWithLogitsLoss` so that each of the ten labels is predicted independently; a minimal illustration follows.
+
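+ An illustration of this objective with toy tensors (not the actual training code):
+
+ ```python
+ import torch
+
+ # One independent binary decision per label: sigmoid + binary cross-entropy.
+ loss_fn = torch.nn.BCEWithLogitsLoss()
+ logits = torch.randn(2, 10)   # batch of 2 texts, 10 foundation labels
+ targets = torch.zeros(2, 10)  # multi-hot ground truth
+ targets[0, 0] = 1.0           # e.g. the first label present in text 0
+ loss = loss_fn(logits, targets)
+ print(loss)
+ ```
+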
+ ## More Information
+
+ This model is based on Moral Foundations Theory, a social-psychological theory that explains the origins of and variation in human moral reasoning. The theory proposes six moral foundations: Care/Harm, Fairness/Cheating, Loyalty/Betrayal, Authority/Subversion, Sanctity/Degradation, and Liberty/Oppression.
+
+ ***The current model focuses on the first five.***
+
+ ## Model Card Authors
+
+ M. Murat Ardag
+
+ ## Model Card Contact
+
+ Via my personal website.
+
+ ## Citation
+
+ **If you use this model in your research or applications, please cite it as follows:**
+
+ Ardag, M. M. (2024). Moral Foundations Classifier. HuggingFace. https://doi.org/10.57967/hf/2774
+
+ ## Glossary
+
+ Moral Foundations Theory identifies five core foundations guiding human morality across cultures:
+
+ * **Care/Harm:** Rooted in empathy and attachment, it values kindness and compassion.
+ * **Fairness/Cheating:** Based on reciprocal altruism, it emphasizes justice and fairness.
+ * **Loyalty/Betrayal:** Stemming from humans' tribal past, it values group loyalty and condemns betrayal.
+ * **Authority/Subversion:** Shaped by hierarchical social structures, it values respect for authority and tradition.
+ * **Sanctity/Degradation:** Linked to disgust and purity, it values living in a noble way and avoiding degradation.
config.json CHANGED
@@ -52,7 +52,7 @@
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
- "transformers_version": "4.43.1",
+ "transformers_version": "4.43.2",
  "type_vocab_size": 1,
  "vocab_size": 50265
  }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a94cc65b4e1b368352d30a84aa215e26b8aca38d4e1980cefd386f96de2e5f1a
+ oid sha256:a1eb4f67676ae5514b0c44d6e6a260392e19c19c4b86599e684460911edd4a8b
  size 498637432
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:36aa5f3e4ccb9d54515a4f1158d14b31d6725c98993ad3503f8d9210295bc94f
+ oid sha256:de385a12fdaf92a7bd329f154f338de4e4f9261c6c57f964ade756132ac3a508
  size 498697326