Update README.md
README.md
CHANGED
@@ -32,12 +32,6 @@ The 28 labels from the [go_emotions](https://huggingface.co/datasets/go_emotions
 
 This is a multi-label, multi-class dataset, so each label is effectively a separate binary classification. Evaluating across all labels per item in the go_emotions test split the metrics are shown below.
 
-Using a fixed threshold of 0.5 to convert the scores to binary predictions for each label, the metrics (evaluated on the go_emotions test split) are:
-
-- Precision: 0.602
-- Recall: 0.250
-- F1: 0.303
-
 Optimising the threshold per label to optimise the F1 metric, the metrics (evaluated on the go_emotions test split) are:
 
 - Precision: 0.445
@@ -50,6 +44,12 @@ Weighted by the relative support of each label in the dataset, this is:
 - Recall: 0.582
 - F1: 0.514
 
+Using a fixed threshold of 0.5 to convert the scores to binary predictions for each label, the metrics (evaluated on the go_emotions test split, and unweighted by support) are:
+
+- Precision: 0.602
+- Recall: 0.250
+- F1: 0.303
+
 ### Metrics (per-label)
 
 This is a multi-label, multi-class dataset, so each label is effectively a separate binary classification and metrics are better measured per label.
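The README text touched by this change contrasts a fixed 0.5 threshold with a per-label threshold chosen to maximise F1, and unweighted averages with averages weighted by label support. The following is a minimal sketch of those ideas in plain Python, not the author's evaluation code. It assumes the reported figures are averages of per-label metrics and that a score at or above the threshold counts as a positive prediction; all function names and the candidate threshold grid are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Metrics = Tuple[float, float, float]  # (precision, recall, f1)


def binarize(scores: Sequence[float], threshold: float) -> List[int]:
    # Assumption: a score at or above the threshold is a positive prediction.
    return [1 if s >= threshold else 0 for s in scores]


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Metrics:
    # Binary metrics for a single label; undefined ratios are reported as 0.0.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def best_f1_threshold(
    y_true: Sequence[int], scores: Sequence[float], candidates: Sequence[float]
) -> float:
    # Pick the candidate threshold with the highest F1 for this one label.
    # Ties keep the earliest candidate.
    best_t, best_f1 = candidates[0], -1.0
    for t in candidates:
        f1 = precision_recall_f1(y_true, binarize(scores, t))[2]
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t


def averaged_metrics(
    per_label: Sequence[Metrics], supports: Optional[Sequence[float]] = None
) -> Metrics:
    # Unweighted mean over labels, or a support-weighted mean if supports are given.
    weights = [1.0] * len(per_label) if supports is None else list(supports)
    total = sum(weights)
    p, r, f = (
        sum(w * m[i] for w, m in zip(weights, per_label)) / total for i in range(3)
    )
    return p, r, f
```

To use it, compute `precision_recall_f1` once per label (with either the fixed or the tuned threshold) and pass the resulting list to `averaged_metrics`, with or without per-label supports. Tuning thresholds on the same split that is used for reporting tends to give optimistic numbers, so a held-out split is usually preferable for the tuning step.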