BERT Multilingual Cased NER - Optimized and Quantized for Spanish Legal Texts
Model Description
This model is an optimized and quantized version of Davlan/bert-base-multilingual-cased-ner-hrl, tailored for Named Entity Recognition (NER) tasks in Spanish legal documents. The original model was exported to ONNX format and underwent static quantization to int8 precision using the 🤗 Optimum library. Calibration was performed with a dataset of Spanish legal texts to enhance performance in this specific domain.
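The export-and-quantize workflow itself is not included in this repository, but the steps described above can be sketched with 🤗 Optimum's `ORTQuantizer` API. This is a sketch only, not the exact script used here: the calibration dataset name (`"my_spanish_legal_corpus"`), the preprocessing, the sample count, and the AVX512-VNNI target are all placeholder assumptions, since those details are not published.

```python
from functools import partial

from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForTokenClassification, ORTQuantizer
from optimum.onnxruntime.configuration import AutoCalibrationConfig, AutoQuantizationConfig

model_id = "Davlan/bert-base-multilingual-cased-ner-hrl"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Export the PyTorch checkpoint to ONNX on the fly.
model = ORTModelForTokenClassification.from_pretrained(model_id, export=True)
quantizer = ORTQuantizer.from_pretrained(model)

# Static int8 quantization config (AVX512-VNNI kernels chosen as an example target).
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=True, per_channel=False)

def preprocess(examples, tokenizer):
    return tokenizer(examples["text"], truncation=True, padding="max_length", max_length=128)

# "my_spanish_legal_corpus" is a hypothetical placeholder for the calibration set.
calibration_dataset = quantizer.get_calibration_dataset(
    "my_spanish_legal_corpus",
    preprocess_function=partial(preprocess, tokenizer=tokenizer),
    num_samples=100,
    dataset_split="train",
)
calibration_config = AutoCalibrationConfig.minmax(calibration_dataset)

# Run calibration to collect activation ranges, then write the int8 model to disk.
ranges = quantizer.fit(
    dataset=calibration_dataset,
    calibration_config=calibration_config,
    operators_to_quantize=qconfig.operators_to_quantize,
)
quantizer.quantize(
    save_dir="onnx-int8",
    calibration_tensors_range=ranges,
    quantization_config=qconfig,
)
```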
Usage
To use this model, make sure the `optimum` library is installed. Here's an example of how to load and use the model for NER tasks:
```python
from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("raul-delarosa99/bert-base-multilingual-cased-ner-es-onnx-static-int8")
model = ORTModelForTokenClassification.from_pretrained("raul-delarosa99/bert-base-multilingual-cased-ner-es-onnx-static-int8")

nlp_ner = pipeline(
    "ner",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple"
)

nlp_ner("Hola, soy Pedro y vivo en Toluca.")
```
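The snippet above assumes the ONNX Runtime backend of 🤗 Optimum is available; it is published as an extra of the `optimum` package:

```shell
pip install "optimum[onnxruntime]"
```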
Note: This model requires the `optimum` library for proper functionality. Loading it with `AutoModelForTokenClassification` from the standard `transformers` library may result in errors, because the PyTorch-specific weight files are not included in this repository.
Limitations
- Domain Specificity: The quantization calibration was performed using Spanish legal texts, which may affect performance in other domains or languages.
- Quantization Effects: While quantization reduces model size and increases inference speed, it may introduce slight degradations in accuracy.
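To make the second point concrete, below is a minimal, self-contained sketch of the affine int8 mapping that static quantization relies on. The calibration range `[-4.0, 4.0]` and the sample value are made up for illustration; the round-trip error shows where the slight accuracy degradation comes from.

```python
def make_quantizer(rmin: float, rmax: float, qmin: int = -128, qmax: int = 127):
    """Derive a scale and zero-point from a calibration range, as observed
    during static calibration, and return quantize/dequantize helpers."""
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = round(qmin - rmin / scale)

    def quantize(x: float) -> int:
        q = round(x / scale) + zero_point
        return max(qmin, min(qmax, q))  # clamp into the int8 range

    def dequantize(q: int) -> float:
        return (q - zero_point) * scale

    return quantize, dequantize

# Hypothetical calibration range for one activation tensor.
quantize, dequantize = make_quantizer(rmin=-4.0, rmax=4.0)

q = quantize(1.337)          # stored as a single int8 value
approx = dequantize(q)       # close to 1.337, but not exact
error = abs(approx - 1.337)  # bounded by the quantization step (the scale)
```

Values outside the calibrated range are clamped, which is why a calibration set that matches the deployment domain (here, Spanish legal text) matters.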
Citation
If you use this model, please cite the original base model:
```bibtex
@misc{davlan2021bertner,
  title={BERT base multilingual cased NER},
  author={Davlan, B.},
  year={2021},
  howpublished={\url{https://huggingface.co/Davlan/bert-base-multilingual-cased-ner-hrl}}
}
```