mmuratardag committed · Commit eaeef66 · Parent(s): f36501a

updated model & README

Files changed:
- README.md +256 -1
- config.json +1 -1
- model.safetensors +1 -1
- pytorch_model.bin +1 -1

README.md
CHANGED
@@ -2,4 +2,259 @@
license: gpl-3.0
language:
- en
---

# Model Card for Model ID

This model is a fine-tuned RoBERTa-based classifier designed to predict the presence of ten moral foundations (five virtues and five vices) in text. It can be used for text analysis tasks such as content analysis and opinion mining.

## Model Details

### Model Description

- **Developed by:** M. Murat Ardag
- **Funded by [optional]:** not relevant
- **Shared by [optional]:** not relevant
- **Model type:** Multi-label Classification
- **License:** GPL-3.0
- **Finetuned from model [optional]:** roberta-base

### Model Sources [optional]

- **Repository:** https://huggingface.co/MMADS/MoralFoundationsClassifier
- **See the following sources & papers for alternatives:**
  - https://moralfoundations.org/other-materials/
  - Frimer, J. A., Boghrati, R., Haidt, J., Graham, J., & Dehghani, M. (2019). Moral Foundations Dictionary for Linguistic Analyses 2.0. Unpublished manuscript.
  - Hopp, F. R., Fisher, J. T., Cornell, D., Huskey, R., & Weber, R. (2021). The extended Moral Foundations Dictionary (eMFD): Development and applications of a crowd-sourced approach to extracting moral intuitions from text. Behavior Research Methods, 53(1), 232-246. https://doi.org/10.3758/s13428-020-01433-0
  - Kennedy, B., Atari, M., Mostafazadeh Davani, A., Hoover, J., Omrani, A., Graham, J., & Dehghani, M. (2021). Moral concerns are differentially observable in language. Cognition, 212, 104696. https://doi.org/10.1016/j.cognition.2021.104696

## Uses

### Direct Use

The model can be directly used for classifying text into the following moral foundations:

- **Care**: Caring for others and protecting them from harm.
- **Fairness**: Justice and treating others equally.
- **Loyalty**: Group loyalty, patriotism, self-sacrifice for the group.
- **Authority**: Respect for tradition and legitimate authority.
- **Sanctity**: Disgust; avoiding dangerous diseases and contaminants.

Each foundation is represented as a virtue (positive expression) and a vice (negative expression), yielding ten labels in total; see the sketch below.

The model is particularly useful for researchers, policymakers, and analysts interested in understanding moral reasoning and rhetoric in different contexts.
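
The label set follows a `<foundation>_virtue` / `<foundation>_vice` naming scheme, matching the names in the per-class metrics reported below. A minimal sketch; the exact ordering inside the repo's `label_names.json` is an assumption, so prefer loading that file over hard-coding:

```python
# The ten moral-foundation labels: one virtue and one vice per foundation.
# Names match the evaluation section of this card; the ordering is assumed.
LABELS = [
    f"{foundation}_{polarity}"
    for foundation in ("care", "fairness", "loyalty", "authority", "sanctity")
    for polarity in ("virtue", "vice")
]
```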

### Downstream Use

Potential downstream uses include:

- **Content analysis**: Analyzing the moral framing of news articles, social media posts, or other types of text.
- **Opinion mining**: Understanding the moral values underlying people's opinions and arguments.
- **Ethical assessment**: Evaluating the ethical implications of decisions, policies, or products.

### Out-of-Scope Use

- This model is not designed for predicting specific actions or behaviors based on moral foundations.
- It may not accurately generalize to texts that are significantly different in style or domain from its training data **(NOT RECOMMENDED FOR SOCIAL MEDIA DATA)**.
- The model should not be used to make definitive judgments about the morality of individuals or groups.

## Bias, Risks, and Limitations

- The model is trained on a diverse corpus but may still reflect biases present in the data.
- It may be sensitive to specific wording or phrasing, leading to inconsistent predictions.
- The interpretation of moral foundations can be subjective, and the model's predictions may not always align with human judgment.

### Recommendations

- Be aware of the potential biases and limitations of the model.
- Carefully consider the context and purpose of the text being analyzed.
- Use the model's predictions as a starting point for further analysis and discussion.

## How to Get Started with the Model

Use the code below to get started with the model. The prediction helper at the end is a minimal sketch (per-label sigmoid with an assumed 0.5 threshold), not the author's original function:

```python
import json

import torch
from huggingface_hub import hf_hub_download
from transformers import RobertaForSequenceClassification, RobertaTokenizer

# Load the model and tokenizer
model_path = "MMADS/MoralFoundationsClassifier"
model = RobertaForSequenceClassification.from_pretrained(model_path)
tokenizer = RobertaTokenizer.from_pretrained(model_path)
model.eval()

# Load label names; the JSON file lives in the model repo, so fetch it from
# the Hub rather than assuming a local directory
label_names_path = hf_hub_download(repo_id=model_path, filename="label_names.json")
with open(label_names_path, "r") as f:
    label_names = json.load(f)  # assumed: a list of label strings in output order

# Minimal prediction helper (sketch): multi-label classification applies a
# sigmoid per label; the 0.5 threshold is an assumption, adjust as needed
def predict(text, threshold=0.5):
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.sigmoid(logits).squeeze(0)
    return {label: float(p) for label, p in zip(label_names, probs) if p >= threshold}
```
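
For example (hypothetical input; the printed label and score are illustrative, not a recorded run):

```python
print(predict("We must protect the vulnerable and care for those who suffer."))
# e.g. {'care_virtue': 0.97}  (illustrative output)
```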

## Training Details

### Training Data

The model was trained on a diverse corpus of text, including:

- Personal blogs
- Political blogs
- News media articles
- Essays
- Short stories
- Parliamentary debates from the US, UK, CA, and NZ
- Speeches at the UN
- Speeches by central bank presidents

The training data is a subset of >60M sentences.

### Training Procedure

The model was fine-tuned using the HuggingFace Transformers library with the following hyperparameters (a sketch of how they map onto the `Trainer` API follows the list):

- `num_train_epochs`: 10
- `per_device_train_batch_size`: 8
- `per_device_eval_batch_size`: 8
- Learning rate: 3e-5
- Optimizer: AdamW
- Loss function: Binary Cross Entropy with Logits Loss (`BCEWithLogitsLoss`)
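
A minimal, self-contained sketch of that setup, assuming the standard `Trainer` workflow. The toy dataset, `output_dir`, and the `problem_type="multi_label_classification"` flag (which makes the model apply `BCEWithLogitsLoss` to multi-hot labels) are illustrative assumptions, not details taken from this card:

```python
import torch
from transformers import (RobertaForSequenceClassification, RobertaTokenizer,
                          Trainer, TrainingArguments)

class MoralDataset(torch.utils.data.Dataset):
    """Wraps tokenized texts and multi-hot label vectors for the Trainer."""
    def __init__(self, texts, labels, tokenizer):
        self.encodings = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[i], dtype=torch.float)
        return item

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=10, problem_type="multi_label_classification"
)

# Toy stand-in data; the real >60M-sentence corpus is not public
train_texts = ["We must protect the vulnerable.", "He betrayed his own team."]
train_labels = [[1] + [0] * 9, [0] * 5 + [1] + [0] * 4]  # multi-hot over 10 labels

args = TrainingArguments(
    output_dir="mf-classifier",        # placeholder
    num_train_epochs=10,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    learning_rate=3e-5,                # the Trainer's default optimizer is AdamW
)

trainer = Trainer(model=model, args=args,
                  train_dataset=MoralDataset(train_texts, train_labels, tokenizer))
trainer.train()
```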

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated on a held-out portion of the training data.

#### Metrics

- **Overall Accuracy:** 0.9957
- **Overall Precision:** 0.9957
- **Overall Recall:** 0.9957
- **Overall F1-score:** 0.9957

See the results below for detailed per-class metrics.

### Results

The model achieves high overall performance, with variation across the moral foundations. The *loyalty_vice* category has a notably lower F1-score due to low recall, indicating difficulty in identifying this specific vice.

***Per-class metrics:***

| Class            | Accuracy | Precision | Recall | F1     |
|------------------|----------|-----------|--------|--------|
| care_virtue      | 0.9954   | 0.9779    | 0.9758 | 0.9769 |
| care_vice        | 0.9960   | 0.9734    | 0.9506 | 0.9619 |
| fairness_virtue  | 0.9974   | 0.9786    | 0.9645 | 0.9715 |
| fairness_vice    | 0.9970   | 0.9319    | 0.8574 | 0.8931 |
| loyalty_virtue   | 0.9945   | 0.9811    | 0.9780 | 0.9795 |
| loyalty_vice     | 0.9972   | 1.0000    | 0.0531 | 0.1008 |
| authority_virtue | 0.9914   | 0.9621    | 0.9683 | 0.9652 |
| authority_vice   | 0.9963   | 0.9848    | 0.5838 | 0.7331 |
| sanctity_virtue  | 0.9963   | 0.9640    | 0.9458 | 0.9548 |
| sanctity_vice    | 0.9958   | 0.9538    | 0.8530 | 0.9006 |
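
As a reference point, per-class metrics of this kind can be computed with scikit-learn; a sketch with placeholder arrays (not the card author's evaluation script):

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

LABELS = [f"{f}_{p}" for f in ("care", "fairness", "loyalty", "authority", "sanctity")
          for p in ("virtue", "vice")]

# Placeholder data: (n_samples, 10) binary matrices of gold and predicted labels
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(100, 10))
y_pred = rng.integers(0, 2, size=(100, 10))

# Each label is evaluated as an independent binary classification problem
for j, label in enumerate(LABELS):
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true[:, j], y_pred[:, j], average="binary", zero_division=0
    )
    accuracy = accuracy_score(y_true[:, j], y_pred[:, j])
    print(f"{label}: accuracy={accuracy:.4f} precision={precision:.4f} "
          f"recall={recall:.4f} f1={f1:.4f}")
```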

## Model Examination

## Environmental Impact

**Minimal.** Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** Google Colab GPU
- **Hours used:** 8
- **Cloud Provider:** Google
- **Compute Region:** South Carolina

## Technical Specifications

### Model Architecture and Objective

The model is based on the RoBERTa architecture, fine-tuned for multi-label classification. It uses `BCEWithLogitsLoss` to handle the multi-label nature of the task.
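
Concretely, for logits $x \in \mathbb{R}^{10}$ and a multi-hot label vector $y \in \{0,1\}^{10}$, `BCEWithLogitsLoss` with its default mean reduction computes

$$
\mathcal{L}(x, y) = -\frac{1}{10} \sum_{j=1}^{10} \Big[ y_j \log \sigma(x_j) + (1 - y_j) \log\big(1 - \sigma(x_j)\big) \Big],
$$

where $\sigma$ is the sigmoid; each of the ten labels is scored as an independent binary decision, which is why inference applies a per-label sigmoid and threshold rather than a softmax.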

## More Information

This model is based on Moral Foundations Theory, a social psychological theory that explains the origins of, and variation in, human moral reasoning. The theory proposes six moral foundations: Care/Harm, Fairness/Cheating, Loyalty/Betrayal, Authority/Subversion, Sanctity/Degradation, and Liberty/Oppression.

***The current model, however, focuses on the first five.***

## Model Card Authors

M. Murat Ardag

## Model Card Contact

Via my personal website. Thanks.

## Citation

***If you use this model in your research or applications, please cite it as follows:***

Ardag, M. M. (2024). Moral Foundations Classifier. Hugging Face. https://doi.org/10.57967/hf/2774

## Glossary

Moral Foundations Theory identifies five core foundations guiding human morality across cultures:

* **Care/Harm:** Rooted in empathy and attachment, it values kindness and compassion.
* **Fairness/Cheating:** Based on reciprocal altruism, it emphasizes justice and fairness.
* **Loyalty/Betrayal:** Stemming from humans' tribal past, it values group loyalty and condemns betrayal.
* **Authority/Subversion:** Shaped by hierarchical social structures, it values respect for authority and tradition.
* **Sanctity/Degradation:** Linked to disgust and purity, it values living in a noble way and avoiding degradation.
config.json
CHANGED
@@ -52,7 +52,7 @@
   "num_attention_heads": 12,
   "num_hidden_layers": 12,
   "pad_token_id": 1,
-  "transformers_version": "4.43.
+  "transformers_version": "4.43.2",
   "type_vocab_size": 1,
   "vocab_size": 50265
 }
|
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:a1eb4f67676ae5514b0c44d6e6a260392e19c19c4b86599e684460911edd4a8b
 size 498637432
pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:de385a12fdaf92a7bd329f154f338de4e4f9261c6c57f964ade756132ac3a508
 size 498697326