---
license: mit
datasets:
- ai4privacy/open-pii-masking-500k-ai4privacy
language:
- fr
- en
- de
- te
- hi
- it
- es
- nl
base_model:
- answerdotai/ModernBERT-base
library_name: transformers
tags:
- PII
- redaction
- anonymisation
- token-classification
model-index:
- name: multilingual-anonymiser-openpii-ai4privacy
  results:
  - task:
      type: token-classification
      name: PII Masking and Classification
    dataset:
      type: ai4privacy/open-pii-masking-500k-ai4privacy
      name: Open PII Masking 500K
      split: test
    metrics:
    - type: f1
      value: 0.9150
      name: F1 Score
    - type: precision
      value: 0.8761
      name: Precision
    - type: recall
      value: 0.9576
      name: Recall
    - type: accuracy
      value: 0.9503
      name: Accuracy
---
# Multilingual Anonymiser OpenPII (Ai4Privacy)
This model **redacts and classifies Personally Identifiable Information (PII)** in multilingual text. It was fine-tuned on the [open-pii-masking-500k-ai4privacy](https://huggingface.co/datasets/ai4privacy/open-pii-masking-500k-ai4privacy) dataset and supports the following languages: French (fr), English (en), German (de), Telugu (te), Hindi (hi), Italian (it), Spanish (es), and Dutch (nl).
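The sketch below illustrates the redaction step only. It assumes entity spans shaped like the dictionaries returned by the `transformers` token-classification pipeline with `aggregation_strategy="simple"` (keys such as `entity_group`, `start`, `end`). The `redact` helper and the sample spans are hypothetical and were written by hand for illustration, not produced by running this model. To obtain real spans you would load the pipeline with this model's repository id, which this card does not state.

```python
from typing import Dict, List


def redact(text: str, entities: List[Dict]) -> str:
    """Replace each detected span with a [LABEL] placeholder."""
    out = text
    # Work right to left so earlier character offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        out = out[: ent["start"]] + f"[{ent['entity_group']}]" + out[ent["end"]:]
    return out


# Hand-written spans in the assumed pipeline output shape (illustrative only).
sample_text = "Contact Maria at maria@example.com"
sample_entities = [
    {"entity_group": "GIVENNAME", "start": 8, "end": 13},
    {"entity_group": "EMAIL", "start": 17, "end": 34},
]

redacted = redact(sample_text, sample_entities)
print(redacted)
```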
---
## Evaluation Metrics
The table below summarizes the detailed evaluation results per PII label. Metrics are presented as percentages rounded to two decimal places. For the "O" (Non-PII) label, precision, recall, and F1 score are not applicable (n/a) due to the absence of true positives.
| **Label** | **TP** | **FP** | **FN** | **Accuracy** | **Precision** | **Recall** | **F1 Score** |
|--------------------|:------:|:------:|:------:|:------------:|:-------------:|:----------:|:------------:|
| O (Non-PII) | 0 | 734 | 0 | 98.97% | n/a | n/a | n/a |
| GIVENNAME | 6623 | 661 | 352 | 86.73% | 90.93% | 94.95% | 92.90% |
| SURNAME | 2786 | 877 | 162 | 72.84% | 76.06% | 94.50% | 84.28% |
| CITY | 1763 | 216 | 225 | 79.99% | 89.09% | 88.68% | 88.88% |
| DATE | 2195 | 1 | 3 | 99.82% | 99.95% | 99.86% | 99.91% |
| AGE | 176 | 7 | 2 | 95.14% | 96.17% | 98.88% | 97.51% |
| EMAIL | 2981 | 0 | 0 | 100.00% | 100.00% | 100.00% | 100.00% |
| CREDITCARDNUMBER | 601 | 57 | 35 | 86.72% | 91.34% | 94.50% | 92.89% |
| SEX | 103 | 45 | 1 | 69.13% | 69.59% | 99.04% | 81.75% |
| SOCIALNUM | 364 | 134 | 20 | 70.27% | 73.09% | 94.79% | 82.54% |
| TIME | 1631 | 1 | 3 | 99.76% | 99.94% | 99.82% | 99.88% |
| TELEPHONENUM | 3537 | 10 | 9 | 99.47% | 99.72% | 99.75% | 99.73% |
| IDCARDNUM | 1540 | 314 | 148 | 76.92% | 83.06% | 91.23% | 86.96% |
| ZIPCODE | 311 | 39 | 16 | 84.97% | 88.86% | 95.11% | 91.87% |
| DRIVERLICENSENUM | 296 | 143 | 26 | 63.66% | 67.43% | 91.93% | 77.79% |
| PASSPORTNUM | 482 | 285 | 25 | 60.86% | 62.84% | 95.07% | 75.67% |
| TITLE | 224 | 68 | 78 | 60.54% | 76.71% | 74.17% | 75.42% |
| BUILDINGNUM | 292 | 45 | 14 | 83.19% | 86.65% | 95.42% | 90.85% |
| STREET | 1272 | 155 | 67 | 85.14% | 89.14% | 94.99% | 91.97% |
| TAXNUM | 471 | 101 | 34 | 77.72% | 82.34% | 93.27% | 87.47% |
| GENDER | 123 | 35 | 9 | 73.65% | 77.85% | 93.18% | 84.83% |
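The per-label columns can be recomputed from the TP, FP, and FN counts. The sketch below uses the standard precision, recall, and F1 definitions. The accuracy formula, TP / (TP + FP + FN), is an assumption inferred from the numbers in the PII rows, since the card does not define it. It does not reproduce the 98.97% shown for the "O" row, which is evidently computed differently. The function name `label_metrics` is hypothetical.

```python
from typing import Dict, Optional


def label_metrics(tp: int, fp: int, fn: int) -> Dict[str, Optional[float]]:
    """Return precision, recall, F1 and accuracy as percentages (2 decimals)."""
    if tp == 0:
        # Mirror the table's convention: no true positives means n/a.
        return {"precision": None, "recall": None, "f1": None, "accuracy": None}
    return {
        "precision": round(100 * tp / (tp + fp), 2),
        "recall": round(100 * tp / (tp + fn), 2),
        "f1": round(100 * 2 * tp / (2 * tp + fp + fn), 2),
        # Assumption inferred from the table: accuracy = TP / (TP + FP + FN).
        "accuracy": round(100 * tp / (tp + fp + fn), 2),
    }


givenname = label_metrics(tp=6623, fp=661, fn=352)
passportnum = label_metrics(tp=482, fp=285, fn=25)
print(givenname)
print(passportnum)
```

With the counts from the GIVENNAME and PASSPORTNUM rows, the sketch returns the same percentages as the table.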
### Overall Evaluation
- **Accuracy:** 95.03%
- **Precision:** 87.61%
- **Recall:** 95.76%
- **F1 Score:** 91.50%
- **Total True Positives (TP):** 27,771
- **Total False Positives (FP):** 3,928
- **Total False Negatives (FN):** 1,229
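The overall precision, recall, and F1 score are consistent with micro-averaging, that is, pooling the counts of all 21 table rows (including the 734 false positives in the "O" row) before computing the ratios. The sketch below sums the table's counts and recomputes the three figures. The overall accuracy of 95.03% is not derivable from these counts and is left out. Variable names are illustrative.

```python
# (TP, FP, FN) per label, copied from the evaluation table.
counts = {
    "O": (0, 734, 0), "GIVENNAME": (6623, 661, 352), "SURNAME": (2786, 877, 162),
    "CITY": (1763, 216, 225), "DATE": (2195, 1, 3), "AGE": (176, 7, 2),
    "EMAIL": (2981, 0, 0), "CREDITCARDNUMBER": (601, 57, 35), "SEX": (103, 45, 1),
    "SOCIALNUM": (364, 134, 20), "TIME": (1631, 1, 3), "TELEPHONENUM": (3537, 10, 9),
    "IDCARDNUM": (1540, 314, 148), "ZIPCODE": (311, 39, 16),
    "DRIVERLICENSENUM": (296, 143, 26), "PASSPORTNUM": (482, 285, 25),
    "TITLE": (224, 68, 78), "BUILDINGNUM": (292, 45, 14), "STREET": (1272, 155, 67),
    "TAXNUM": (471, 101, 34), "GENDER": (123, 35, 9),
}

# Pool the counts across labels (micro-averaging).
total_tp = sum(tp for tp, _, _ in counts.values())
total_fp = sum(fp for _, fp, _ in counts.values())
total_fn = sum(fn for _, _, fn in counts.values())

micro_precision = round(100 * total_tp / (total_tp + total_fp), 2)
micro_recall = round(100 * total_tp / (total_tp + total_fn), 2)
micro_f1 = round(100 * 2 * total_tp / (2 * total_tp + total_fp + total_fn), 2)

print(total_tp, total_fp, total_fn)
print(micro_precision, micro_recall, micro_f1)
```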
### Macro-Averaged Metrics
- **Accuracy:** 82.17%
- **Precision:** 80.99%
- **Recall:** 89.96%
- **F1 Score:** 84.91%
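A macro average is the unweighted mean of the per-label values. The reported precision, recall, and F1 figures appear to be consistent with averaging over all 21 table rows, with the "O" row's n/a entries counted as zero. This is an inference from the table, not something the card states. Averaging over the 20 PII labels alone would give roughly 85.0% precision, 94.5% recall, and 89.2% F1. The sketch below reproduces the reported figures from the table's rounded percentages.

```python
# (precision, recall, f1) in percent for the 20 PII labels, in table order.
pii_rows = [
    (90.93, 94.95, 92.90), (76.06, 94.50, 84.28), (89.09, 88.68, 88.88),
    (99.95, 99.86, 99.91), (96.17, 98.88, 97.51), (100.0, 100.0, 100.0),
    (91.34, 94.50, 92.89), (69.59, 99.04, 81.75), (73.09, 94.79, 82.54),
    (99.94, 99.82, 99.88), (99.72, 99.75, 99.73), (83.06, 91.23, 86.96),
    (88.86, 95.11, 91.87), (67.43, 91.93, 77.79), (62.84, 95.07, 75.67),
    (76.71, 74.17, 75.42), (86.65, 95.42, 90.85), (89.14, 94.99, 91.97),
    (82.34, 93.27, 87.47), (77.85, 93.18, 84.83),
]


def macro(rows, column):
    """Unweighted mean of one metric column across the given rows."""
    return sum(row[column] for row in rows) / len(rows)


# Assumption: the "O" row's n/a values are treated as 0.0 in the reported macro.
rows_with_o = [(0.0, 0.0, 0.0)] + pii_rows

macro_precision = round(macro(rows_with_o, 0), 2)
macro_recall = round(macro(rows_with_o, 1), 2)
macro_f1 = round(macro(rows_with_o, 2), 2)
print(macro_precision, macro_recall, macro_f1)
```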
---
## Model Behavior & Limitations
- **Evaluation Focus:**
The metrics above reflect performance on the test split of the [open-pii-masking-500k-ai4privacy](https://huggingface.co/datasets/ai4privacy/open-pii-masking-500k-ai4privacy) dataset. This model both redacts and classifies PII into specific categories (e.g., GIVENNAME, EMAIL). Real-world performance may vary depending on text domain and language, so additional validation is recommended. For support, contact **[email protected]**.
- **Strengths:**
- High recall (95.76%) means most PII in the test split is detected.
- Exceptional performance on labels like "EMAIL" (100% F1), "DATE" (99.91% F1), and "TIME" (99.88% F1).
- **Limitations:**
- Lower precision for labels such as "PASSPORTNUM" (62.84%) and "DRIVERLICENSENUM" (67.43%), indicating a higher rate of false positives.
- No precision, recall, or F1 score is reported for the "O" (Non-PII) label: it has no true positives, so these metrics are not applicable (n/a).
---
## Disclaimer
This model card details the evaluation metrics for the multilingual anonymiser with PII classification capabilities. **Please note:**
- The model is provided **as-is** under the MIT License.
- It is intended for both redaction and PII classification purposes.
- Users should thoroughly test and evaluate its performance on their specific datasets before deploying in production environments.
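One possible starting point for testing on your own data is to compare predicted spans against hand-labelled spans and count TP, FP, and FN per label. The sketch below assumes exact-match scoring on `(start, end, label)` tuples. This may differ from the scoring used for the table above, which the card does not describe. The `count_matches` helper and the sample spans are hypothetical.

```python
from collections import Counter
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, label)


def count_matches(gold: Set[Span], predicted: Set[Span]) -> Counter:
    """Count exact-match TP, FP and FN per label."""
    counts: Counter = Counter()
    for span in predicted:
        # A prediction is a TP only if the same span and label are in gold.
        counts[(span[2], "tp" if span in gold else "fp")] += 1
    for span in gold - predicted:
        # Gold spans the model missed entirely count as FN.
        counts[(span[2], "fn")] += 1
    return counts


# Hand-written example spans (illustrative only).
gold_spans = {(8, 13, "GIVENNAME"), (17, 34, "EMAIL")}
predicted_spans = {(8, 13, "GIVENNAME"), (0, 7, "SURNAME")}

result = count_matches(gold_spans, predicted_spans)
print(dict(result))
```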
---
*Ai4Privacy – Committed to protecting personal data in the age of AI.*
---