Update README.md
README.md
CHANGED
@@ -31,83 +31,83 @@ model-index:
|
|
31 |
split: test
|
32 |
metrics:
|
33 |
- type: f1
|
34 |
-
value: 0.
|
35 |
name: F1 Score
|
36 |
- type: precision
|
37 |
-
value: 0.
|
38 |
name: Precision
|
39 |
- type: recall
|
40 |
-
value: 0.
|
41 |
name: Recall
|
42 |
- type: accuracy
|
43 |
-
value: 0.
|
44 |
name: Accuracy
|
45 |
---
|
46 |
|
47 |
-
|
48 |
|
49 |
-
This model is designed to **redact and classify Personally Identifiable Information (PII)** from multilingual text. It has been fine-tuned on the [open-pii-masking-500k-ai4privacy](https://huggingface.co/datasets/ai4privacy/open-pii-masking-500k-ai4privacy) dataset and supports multiple languages including French (fr), English (en), German (de), Telugu (te), Hindi (hi), Italian (it), Spanish (es), and Dutch (nl).
|
50 |
|
51 |
---
|
52 |
|
53 |
## Evaluation Metrics
|
54 |
|
55 |
-
The table below summarizes the detailed evaluation results per PII label
|
56 |
|
57 |
| **Label** | **TP** | **FP** | **FN** | **Accuracy** | **Precision** | **Recall** | **F1 Score** |
|
58 |
|--------------------|:------:|:------:|:------|:------------:|:-------------:|:----------:|:------------:|
|
59 |
-
|
|
60 |
-
|
|
61 |
-
|
|
62 |
-
|
|
63 |
-
|
|
64 |
-
|
|
65 |
-
|
|
66 |
-
|
|
67 |
-
|
|
68 |
-
|
|
69 |
-
|
|
70 |
-
|
|
71 |
-
| IDCARDNUM |
|
72 |
-
|
|
73 |
-
|
|
74 |
-
|
|
75 |
-
|
|
76 |
-
|
|
77 |
-
|
|
78 |
-
|
|
79 |
-
|
|
80 |
|
81 |
### Overall Evaluation
|
82 |
-
- **Accuracy:** 95.
|
83 |
-
- **Precision:**
|
84 |
-
- **Recall:**
|
85 |
-
- **F1 Score:**
|
86 |
|
87 |
-
- **Total True Positives (TP):** 27,
|
88 |
-
- **Total False Positives (FP):** 3,
|
89 |
-
- **Total False Negatives (FN):**
|
90 |
|
91 |
### Macro-Averaged Metrics
|
92 |
-
- **Accuracy:**
|
93 |
-
- **Precision:**
|
94 |
-
- **Recall:**
|
95 |
-
- **F1 Score:**
|
96 |
|
97 |
---
|
98 |
|
99 |
## Model Behavior & Limitations
|
100 |
|
101 |
- **Evaluation Focus:**
|
102 |
-
The metrics
|
103 |
|
104 |
- **Strengths:**
|
105 |
-
- High recall (
|
106 |
-
-
|
107 |
|
108 |
- **Limitations:**
|
109 |
-
- Lower precision for
|
110 |
-
- The "O" (Non-PII) label has no true positives,
|
111 |
|
112 |
---
|
113 |
|
@@ -120,4 +120,6 @@ This model card details the evaluation metrics and fine-tuning parameters for th
|
|
120 |
|
121 |
---
|
122 |
|
123 |
-
*Ai4Privacy – Committed to protecting personal data in the age of AI.*
|
|
|
|
|
|
31 |
split: test
|
32 |
metrics:
|
33 |
- type: f1
|
34 |
+
value: 0.9150
|
35 |
name: F1 Score
|
36 |
- type: precision
|
37 |
+
value: 0.8761
|
38 |
name: Precision
|
39 |
- type: recall
|
40 |
+
value: 0.9576
|
41 |
name: Recall
|
42 |
- type: accuracy
|
43 |
+
value: 0.9503
|
44 |
name: Accuracy
|
45 |
---
|
46 |
|
47 |
+
# Multilingual Anonymiser OpenPII (Ai4Privacy)
|
48 |
|
49 |
+
This model is designed to **redact and classify Personally Identifiable Information (PII)** from multilingual text. It has been fine-tuned on the [open-pii-masking-500k-ai4privacy](https://huggingface.co/datasets/ai4privacy/open-pii-masking-500k-ai4privacy) dataset and supports multiple languages, including French (fr), English (en), German (de), Telugu (te), Hindi (hi), Italian (it), Spanish (es), and Dutch (nl).
|
50 |
|
51 |
---
|
52 |
|
53 |
## Evaluation Metrics
|
54 |
|
55 |
+
The table below summarizes the detailed evaluation results per PII label. Metrics are presented as percentages rounded to two decimal places. For the "O" (Non-PII) label, precision, recall, and F1 score are not applicable (n/a) due to the absence of true positives.
|
56 |
|
57 |
| **Label**          | **TP** | **FP** | **FN** | **Accuracy** | **Precision** | **Recall** | **F1 Score** |
|--------------------|:------:|:------:|:------:|:------------:|:-------------:|:----------:|:------------:|
| O (Non-PII)        | 0      | 734    | 0      | 98.97%       | n/a           | n/a        | n/a          |
| GIVENNAME          | 6623   | 661    | 352    | 86.73%       | 90.93%        | 94.95%     | 92.90%       |
| SURNAME            | 2786   | 877    | 162    | 72.84%       | 76.06%        | 94.50%     | 84.28%       |
| CITY               | 1763   | 216    | 225    | 79.99%       | 89.09%        | 88.68%     | 88.88%       |
| DATE               | 2195   | 1      | 3      | 99.82%       | 99.95%        | 99.86%     | 99.91%       |
| AGE                | 176    | 7      | 2      | 95.14%       | 96.17%        | 98.88%     | 97.51%       |
| EMAIL              | 2981   | 0      | 0      | 100.00%      | 100.00%       | 100.00%    | 100.00%      |
| CREDITCARDNUMBER   | 601    | 57     | 35     | 86.72%       | 91.34%        | 94.50%     | 92.89%       |
| SEX                | 103    | 45     | 1      | 69.13%       | 69.59%        | 99.04%     | 81.75%       |
| SOCIALNUM          | 364    | 134    | 20     | 70.27%       | 73.09%        | 94.79%     | 82.54%       |
| TIME               | 1631   | 1      | 3      | 99.76%       | 99.94%        | 99.82%     | 99.88%       |
| TELEPHONENUM       | 3537   | 10     | 9      | 99.47%       | 99.72%        | 99.75%     | 99.73%       |
| IDCARDNUM          | 1540   | 314    | 148    | 76.92%       | 83.06%        | 91.23%     | 86.96%       |
| ZIPCODE            | 311    | 39     | 16     | 84.97%       | 88.86%        | 95.11%     | 91.87%       |
| DRIVERLICENSENUM   | 296    | 143    | 26     | 63.66%       | 67.43%        | 91.93%     | 77.79%       |
| PASSPORTNUM        | 482    | 285    | 25     | 60.86%       | 62.84%        | 95.07%     | 75.67%       |
| TITLE              | 224    | 68     | 78     | 60.54%       | 76.71%        | 74.17%     | 75.42%       |
| BUILDINGNUM        | 292    | 45     | 14     | 83.19%       | 86.65%        | 95.42%     | 90.85%       |
| STREET             | 1272   | 155    | 67     | 85.14%       | 89.14%        | 94.99%     | 91.97%       |
| TAXNUM             | 471    | 101    | 34     | 77.72%       | 82.34%        | 93.27%     | 87.47%       |
| GENDER             | 123    | 35     | 9      | 73.65%       | 77.85%        | 93.18%     | 84.83%       |
|
80 |
|
81 |
### Overall Evaluation
|
82 |
+
- **Accuracy:** 95.03%
|
83 |
+
- **Precision:** 87.61%
|
84 |
+
- **Recall:** 95.76%
|
85 |
+
- **F1 Score:** 91.50%
|
86 |
|
87 |
+
- **Total True Positives (TP):** 27,771
|
88 |
+
- **Total False Positives (FP):** 3,928
|
89 |
+
- **Total False Negatives (FN):** 1,229
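
The overall precision, recall, and F1 score follow from the aggregate counts under the standard definitions. The sketch below is a minimal illustration, not the evaluation script behind this model card; the helper name `prf1` is hypothetical, and the overall accuracy is left out because the card does not state how it was computed.

```python
def prf1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, f1) from raw counts, using 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Aggregate counts as reported in this model card.
precision, recall, f1 = prf1(tp=27_771, fp=3_928, fn=1_229)
print(f"precision={precision:.2%} recall={recall:.2%} f1={f1:.2%}")
# prints: precision=87.61% recall=95.76% f1=91.50%
```

Returning 0.0 for an undefined ratio is a design choice of this sketch; the table above instead reports n/a for the "O" row.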
|
90 |
|
91 |
### Macro-Averaged Metrics
|
92 |
+
- **Accuracy:** 82.17%
|
93 |
+
- **Precision:** 80.99%
|
94 |
+
- **Recall:** 89.96%
|
95 |
+
- **F1 Score:** 84.91%
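
The macro-averaged precision can be cross-checked against the Precision column of the per-label table. In the sketch below, counting the "O" row as 0 and dividing by all 21 rows is an assumption inferred from the reported numbers; the card does not say how the macro average was formed. Variable names are illustrative only.

```python
# Precision column (in percent) copied from the per-label table; "O" is n/a and omitted.
label_precision = {
    "GIVENNAME": 90.93, "SURNAME": 76.06, "CITY": 89.09, "DATE": 99.95,
    "AGE": 96.17, "EMAIL": 100.00, "CREDITCARDNUMBER": 91.34, "SEX": 69.59,
    "SOCIALNUM": 73.09, "TIME": 99.94, "TELEPHONENUM": 99.72, "IDCARDNUM": 83.06,
    "ZIPCODE": 88.86, "DRIVERLICENSENUM": 67.43, "PASSPORTNUM": 62.84,
    "TITLE": 76.71, "BUILDINGNUM": 86.65, "STREET": 89.14, "TAXNUM": 82.34,
    "GENDER": 77.85,
}

total = sum(label_precision.values())

# Assumption: the "O" row contributes 0 to the sum but still counts as a label.
macro_with_o = total / (len(label_precision) + 1)
# Alternative: leave the "O" row out of the average entirely.
macro_without_o = total / len(label_precision)

print(f"macro precision, O counted as 0: {macro_with_o:.2f}%")
print(f"macro precision, O excluded:     {macro_without_o:.2f}%")
```

Under the first assumption the result rounds to the reported 80.99%; excluding the "O" row gives a noticeably higher figure, so the choice matters when comparing against other model cards.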
|
96 |
|
97 |
---
|
98 |
|
99 |
## Model Behavior & Limitations
|
100 |
|
101 |
- **Evaluation Focus:**
|
102 |
+
The metrics above reflect performance on the test split of the [open-pii-masking-500k-ai4privacy](https://huggingface.co/datasets/ai4privacy/open-pii-masking-500k-ai4privacy) dataset. This model both redacts and classifies PII into specific categories (e.g., GIVENNAME, EMAIL). Real-world performance may vary depending on text domain and language, so additional validation is recommended. For support, contact **[email protected]**.
|
103 |
|
104 |
- **Strengths:**
|
105 |
+
- High recall (95.76%) means most PII in the test split is detected.
|
106 |
+
- Exceptional performance on labels like "EMAIL" (100% F1), "DATE" (99.91% F1), and "TIME" (99.88% F1).
|
107 |
|
108 |
- **Limitations:**
|
109 |
+
- Lower precision for labels such as "PASSPORTNUM" (62.84%) and "DRIVERLICENSENUM" (67.43%), indicating a higher rate of false positives.
|
110 |
+
- The "O" (Non-PII) label has no true positives (TP = 0, FN = 0), so its precision, recall, and F1 score are reported as not applicable (n/a).
|
111 |
|
112 |
---
|
113 |
|
|
|
120 |
|
121 |
---
|
122 |
|
123 |
+
*Ai4Privacy – Committed to protecting personal data in the age of AI.*
|
124 |
+
|
125 |
+
---
|