ai4privacy/llama-ai4privacy-multilingual-anonymiser-openpii

Accuracy: 98.35%
Precision: 95.24%
Recall: 93.35%
F1 Score: 94.29%

Evaluation Metrics

The table below reports the evaluation results for each PII label:

Label	TP	FP	FN	Accuracy	Precision	Recall	F1 Score
SURNAME	3722	0	28	99.25%	100.0%	99.25%	99.63%
O (Non-PII)	0	400	0	99.30%	n/a	n/a	n/a
TIME	1936	0	0	100.0%	100.0%	100.0%	100.0%
DRIVERLICENSENUM	505	0	2	99.61%	100.0%	99.61%	99.80%
PASSPORTNUM	564	0	2	99.65%	100.0%	99.65%	99.82%
GIVENNAME	7548	0	172	97.77%	100.0%	97.77%	98.87%
TELEPHONENUM	3641	0	0	100.0%	100.0%	100.0%	100.0%
BUILDINGNUM	407	0	19	95.54%	100.0%	95.54%	97.72%
AGE	168	0	1	99.41%	100.0%	99.41%	99.70%
DATE	2335	0	0	100.0%	100.0%	100.0%	100.0%
CITY	1672	0	130	92.79%	100.0%	92.79%	96.26%
TITLE	349	0	35	90.89%	100.0%	90.89%	95.23%
IDCARDNUM	1998	0	22	98.91%	100.0%	98.91%	99.45%
GENDER	121	0	0	100.0%	100.0%	100.0%	100.0%
CREDITCARDNUMBER	557	0	1	99.82%	100.0%	99.82%	99.91%
SEX	78	0	1	98.73%	100.0%	98.73%	99.36%
STREET	1368	0	19	98.63%	100.0%	98.63%	99.31%
TAXNUM	345	0	12	96.64%	100.0%	96.64%	98.29%
EMAIL	2606	0	2	99.92%	100.0%	99.92%	99.96%
SOCIALNUM	411	0	11	97.39%	100.0%	97.39%	98.68%
ZIPCODE	406	0	20	95.31%	100.0%	95.31%	97.60%
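
The Precision, Recall and F1 Score values in the PII rows appear consistent with the standard definitions applied to the TP, FP and FN counts. The following is a minimal Python sketch under that assumption. The function name `label_metrics` is illustrative and not part of the model's tooling. The model card does not state how the Accuracy column or the O (Non-PII) row were computed, so neither is reproduced here.

```python
def label_metrics(tp, fp, fn):
    """Standard precision, recall and F1 from raw counts.

    Assumption: the table uses the textbook definitions. A ratio with a
    zero denominator is returned as None rather than guessed.
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else None
    recall = tp / (tp + fn) if (tp + fn) > 0 else None
    if precision is None or recall is None or (precision + recall) == 0:
        f1 = None
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# SURNAME row from the table: TP=3722, FP=0, FN=28
p, r, f1 = label_metrics(3722, 0, 28)
print(round(p * 100, 2), round(r * 100, 2), round(f1 * 100, 2))  # 100.0 99.25 99.63
```

With FP equal to zero for every PII label, precision is 100% throughout, so differences between labels come entirely from missed entities (FN).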

Evaluation Focus:
The metrics above were measured on the test split of the open-pii-masking-500k-ai4privacy dataset. Real-world performance may vary, and production use may call for additional safeguards. Feel free to contact [email protected] for assistance.

This model card details the evaluation metrics and fine-tuning parameters for the multilingual anonymiser. Please note:

The model is provided as-is under the MIT License.
It is intended solely for redaction purposes and does not perform full PII classification.
Users should carefully test and evaluate its performance on their own data before deploying in production environments.
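
One way to test on your own data is to compare the spans the model redacts against a hand-labelled reference and count TP, FP and FN per label. The sketch below assumes exact matching on `(start, end, label)` tuples. This is an illustrative choice, not necessarily the matching rule behind the table above, and `count_matches` is a hypothetical helper rather than part of any Ai4Privacy tooling.

```python
from collections import Counter


def count_matches(gold, predicted):
    """Count per-label TP, FP and FN under exact span matching.

    gold, predicted: iterables of (start, end, label) tuples.
    Assumption: a prediction counts only if start, end and label all match.
    """
    gold_set, pred_set = set(gold), set(predicted)
    tp, fp, fn = Counter(), Counter(), Counter()
    for span in pred_set:
        label = span[2]
        if span in gold_set:
            tp[label] += 1  # predicted and present in the reference
        else:
            fp[label] += 1  # predicted but not in the reference
    for span in gold_set - pred_set:
        fn[span[2]] += 1  # in the reference but missed by the model
    return tp, fp, fn


# Made-up example spans, for illustration only
gold = [(0, 5, "GIVENNAME"), (6, 11, "SURNAME"), (25, 31, "CITY")]
predicted = [(0, 5, "GIVENNAME"), (6, 11, "SURNAME"), (20, 24, "CITY")]
tp, fp, fn = count_matches(gold, predicted)
```

Here the mislocated CITY span counts once as a false positive and once as a false negative. For redaction use cases, a more lenient overlap-based rule may be a better fit, since a partially masked entity can still leak information.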

Ai4Privacy – Committed to protecting personal data in the age of AI.