Gulbert-ft-ita
This model performs multi-label classification of Italian legislative acts according to the subject index (taxonomy) currently adopted in the Gazzetta Ufficiale. It was obtained by fine-tuning a BERT-XXL Italian model on a large corpus of legislative acts published in the Gazzetta Ufficiale from 1988 until early 2022.
Model Details
Model Description
- Language(s) (NLP): Italian
- License: apache-2.0
- Finetuned from model: https://huggingface.co/dbmdz/bert-base-italian-xxl-uncased
Model Sources
- Repository: https://huggingface.co/dhfbk
- Paper: M. Rovera, A. Palmero Aprosio, F. Greco, M. Lucchese, S. Tonelli and A. Antetomaso (2023). Italian Legislative Text Classification for Gazzetta Ufficiale. In Proceedings of the Fifth Natural Legal Language Processing Workshop (NLLP 2023).
- Demo: https://dh-server.fbk.eu/ipzs-ui-demo/
Uses
Direct Use
Multi-label text classification of Italian legislative acts.
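A minimal sketch of how such a classifier might be called with the `transformers` library follows. Several details are assumptions rather than facts from this card: the model identifier `dhfbk/gulbert-ft-ita` (inferred from the repository and model name), the checkpoint being loadable with `AutoModelForSequenceClassification`, the 512-token truncation, and the 0.5 decision threshold. The helper names `select_labels` and `classify_act` are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def select_labels(logits, id2label, threshold=0.5):
    """Turn raw per-label logits into (label, probability) pairs.

    Multi-label classification scores every label independently, so a
    sigmoid (not a softmax) is applied to each logit, and every label
    whose probability reaches the threshold is kept.
    """
    scored = []
    for idx, logit in enumerate(logits):
        prob = sigmoid(logit)
        if prob >= threshold:
            scored.append((id2label[idx], prob))
    # Most probable labels first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def classify_act(text, model_id="dhfbk/gulbert-ft-ita", threshold=0.5):
    """Classify one legislative act; downloads the model on first use.

    The default model_id and threshold are assumptions, not values
    confirmed by the model card.
    """
    # Imported lazily so select_labels works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return select_labels(logits, model.config.id2label, threshold)
```

Calling `classify_act("Decreto del Presidente della Repubblica ...")` would return the predicted taxonomy labels with their probabilities, provided the assumptions above hold. The threshold is worth tuning on held-out data, since a single cut-off rarely suits both frequent and rare labels.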
Training Details
Training Data
The dataset used to train the model is available from our GitHub account and is further documented in the paper cited above.
Evaluation
Results
The model achieves a micro-F1 of 0.873, a macro-F1 of 0.471 and a weighted-F1 of 0.864 on the test set (3-fold average).
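The three averages weight labels differently, which is why they can diverge. The sketch below computes them on a toy example; the data and the label names `frequent` and `rare` are invented for illustration and are not taken from the paper or its test set. Micro-F1 pools all decisions, macro-F1 gives each label equal weight, and weighted-F1 weights each label by its support.

```python
def f1_scores(y_true, y_pred, labels):
    """Return (micro, macro, weighted) F1 for multi-label predictions.

    y_true and y_pred are lists of label sets, one set per document.
    """
    counts = {}
    for label in labels:
        tp = fp = fn = 0
        for truth, pred in zip(y_true, y_pred):
            if label in truth and label in pred:
                tp += 1
            elif label in pred:
                fp += 1
            elif label in truth:
                fn += 1
        counts[label] = (tp, fp, fn)

    def f1(tp, fp, fn):
        denom = 2 * tp + fp + fn
        return 2 * tp / denom if denom else 0.0

    # Micro: sum the counts over all labels, then compute one F1.
    micro = f1(*(sum(c[i] for c in counts.values()) for i in range(3)))

    # Macro: unweighted mean of the per-label F1 scores.
    macro = sum(f1(*c) for c in counts.values()) / len(labels)

    # Weighted: per-label F1 weighted by support (true occurrences).
    support = {label: tp + fn for label, (tp, fp, fn) in counts.items()}
    total = sum(support.values())
    weighted = (
        sum(f1(*counts[label]) * support[label] for label in labels) / total
        if total
        else 0.0
    )
    return micro, macro, weighted


# Toy data (hypothetical): the rare label is always missed.
y_true = [{"frequent"}, {"frequent"}, {"frequent"}, {"frequent", "rare"}]
y_pred = [{"frequent"}, {"frequent"}, {"frequent"}, {"frequent"}]
micro, macro, weighted = f1_scores(y_true, y_pred, ["frequent", "rare"])
print(round(micro, 3), round(macro, 3), round(weighted, 3))
```

In this toy case a single missed rare label halves the macro average while barely moving the other two. A gap between micro-F1 and macro-F1 is often a sign of weaker performance on infrequent labels, though the sketch does not establish that this is the cause of the figures reported here.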
Citation
BibTeX:
@inproceedings{rovera-etal-2023-italian,
title = "{I}talian Legislative Text Classification for Gazzetta Ufficiale",
author = "Rovera, Marco and
Palmero Aprosio, Alessio and
Greco, Francesco and
Lucchese, Mariano and
Tonelli, Sara and
Antetomaso, Antonio",
booktitle = "Proceedings of the Natural Legal Language Processing Workshop 2023",
year = "2023",
address = "Singapore",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.nllp-1.6",
pages = "44--50"
}