---
license: apache-2.0
base_model: google-bert/bert-base-cased
tags:
- generated_from_trainer
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: ner-portuguese
  results: []
widget:
- text: >-
    Alexandre Telles foi exonerado nesta segunda-feira, assim como o secretário
    nacional de Atenção Especializada à Saúde, Helvécio Magalhães. As mudanças
    se deram depois de muita pressão política sobre Nísia e de reportagem do
    Fantástico, da TV Globo, mostrar no domingo as condições precárias dos
    hospitais na cidade.
  example_title: Example 1
- text: >-
    Os elementos de prova colhidos corroboram as afirmações prestadas pelo
    colaborador MAURO CESAR BARBOSA CID, demonstrando que, por ordem do então
    Presidente JAIR BOLSONARO, MAURO CESAR CID solicitou a AILTON BARROS a
    inserção dos dados falsos de vacinação contra a Covid-19 em benefício do
    ex-Presidente da República e de sua filha”, afirma a PF.
  example_title: Example 2
- text: >-
    De acordo com a polícia, parte dos detidos foi identificado como autores de
    um assalto recente a uma farmácia na região do Morumbi, na zona sul da
    capital paulista. Todos já tinham passagens por outros crimes. O caso foi
    registrado na 5ª delegacia da Divisão de Investigações sobre Crimes contra o
    Patrimônio (DISCCPAT) como roubo e receptação, ambos qualificados, posse
    ilegal de arma de fogo de uso restrito, associação criminosa e adulteração
    de sinal veicular identificador.
  example_title: Example 3
- text: >-
    Dois legumes são suficientes para que você sinta o sabor de ambos no prato.
    Um pode ser mais macio e outro mais firme, como cenoura ou abóbora. Pense em
    um legume que dará saciedade e outro mais refrescante
  example_title: Example 4
language:
- pt
library_name: transformers
pipeline_tag: token-classification
---


# ner-portuguese-br-bert-cased

This model performs named entity recognition (NER) on Portuguese text. It aims to help fill the shortage of models available for Portuguese.

## How to use:

```python
from transformers import AutoTokenizer, BertForTokenClassification, pipeline

model_id = 'rhaymison/ner-portuguese-br-bert-cased'

# Load the fine-tuned model. AutoTokenizer picks whichever tokenizer class
# the checkpoint was saved with, so the class does not need to be hard-coded.
model = BertForTokenClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    model_max_length=512,
    do_lower_case=False,  # the model is cased
)

# aggregation_strategy='simple' merges word pieces into whole entities;
# it replaces the deprecated grouped_entities=True argument.
nlp = pipeline('ner', model=model, tokenizer=tokenizer, aggregation_strategy='simple')

result = nlp("""
A passagem de uma frente fria pelo Rio Grande do Sul e Santa Catarina mantém o tempo instável,
e chove a qualquer hora nos dois estados. Há risco de temporais no sul e leste gaúcho.
No Paraná segue quente, e pancadas de chuva ocorrem a partir da tarde, também com risco de temporais.
""")

# Output (value of `result`):

[{'entity_group': 'LOC',
  'score': 0.99812114,
  'word': 'Rio Grande do Sul',
  'start': 36,
  'end': 53},
 {'entity_group': 'LOC',
  'score': 0.99795854,
  'word': 'Santa Catarina',
  'start': 56,
  'end': 70},
 {'entity_group': 'LOC',
  'score': 0.997009,
  'word': 'Paraná',
  'start': 186,
  'end': 192}]

```
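
The pipeline returns a plain list of dictionaries, so post-processing needs no extra libraries. The sketch below reuses the sample text and output shown above. The `entities_by_label` helper and its `min_score` threshold are illustrative assumptions, not part of the model or of the `transformers` API.

```python
from collections import defaultdict

# Same input as in the example above. The string starts with a newline,
# which is why the first entity begins at offset 36.
text = """
A passagem de uma frente fria pelo Rio Grande do Sul e Santa Catarina mantém o tempo instável,
e chove a qualquer hora nos dois estados. Há risco de temporais no sul e leste gaúcho.
No Paraná segue quente, e pancadas de chuva ocorrem a partir da tarde, também com risco de temporais.
"""

# Entities copied from the sample output above.
result = [
    {'entity_group': 'LOC', 'score': 0.99812114, 'word': 'Rio Grande do Sul', 'start': 36, 'end': 53},
    {'entity_group': 'LOC', 'score': 0.99795854, 'word': 'Santa Catarina', 'start': 56, 'end': 70},
    {'entity_group': 'LOC', 'score': 0.997009, 'word': 'Paraná', 'start': 186, 'end': 192},
]


def entities_by_label(entities, min_score=0.9):
    """Group entity surface forms by label, dropping low-confidence spans.

    `min_score` is a hypothetical threshold chosen for illustration.
    """
    grouped = defaultdict(list)
    for ent in entities:
        if ent['score'] >= min_score:
            grouped[ent['entity_group']].append(ent['word'])
    return dict(grouped)


# The start/end offsets index directly into the input string.
spans = [text[ent['start']:ent['end']] for ent in result]

print(entities_by_label(result))
print(spans)
```

Slicing with `start` and `end` is useful when the original casing and spacing of the entity matter, since `word` is reconstructed from tokens.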


The model predicts the following entity classes, listed here with their label ids:
- `O`: 0
- `B-ANIM`: 1
- `B-BIO`: 2
- `B-CEL`: 3
- `B-DIS`: 4
- `B-EVE`: 5
- `B-FOOD`: 6
- `B-INST`: 7
- `B-LOC`: 8
- `B-MEDIA`: 9
- `B-MYTH`: 10
- `B-ORG`: 11
- `B-PER`: 12
- `B-PLANT`: 13
- `B-TIME`: 14
- `B-VEHI`: 15
- `I-ANIM`: 16
- `I-BIO`: 17
- `I-CEL`: 18
- `I-DIS`: 19
- `I-EVE`: 20
- `I-FOOD`: 21
- `I-INST`: 22
- `I-LOC`: 23
- `I-MEDIA`: 24
- `I-MYTH`: 25
- `I-ORG`: 26
- `I-PER`: 27
- `I-PLANT`: 28
- `I-TIME`: 29
- `I-VEHI`: 30
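
The ids above follow a BIO scheme: a `B-` tag opens an entity, an `I-` tag of the same type continues it, and `O` marks tokens outside any entity. As a minimal sketch, the mapping can be rebuilt from the list and used to merge per-token predictions into spans. The names `ID2LABEL`, `LABEL2ID`, and `decode_spans` are hypothetical helpers; the `id2label` mapping stored in the model's configuration remains the authoritative source, and the pipeline's aggregation already does this grouping for you.

```python
# Entity types in the order used by the label list above.
TYPES = ["ANIM", "BIO", "CEL", "DIS", "EVE", "FOOD", "INST", "LOC",
         "MEDIA", "MYTH", "ORG", "PER", "PLANT", "TIME", "VEHI"]

# Id 0 is "O", ids 1-15 are the B- tags, ids 16-30 are the I- tags.
LABELS = ["O"] + [f"B-{t}" for t in TYPES] + [f"I-{t}" for t in TYPES]
ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {label: i for i, label in ID2LABEL.items()}


def decode_spans(tokens, label_ids):
    """Merge per-token BIO label ids into (entity_type, text) spans.

    Illustrative helper: an I- tag that does not continue an open span
    of the same type is treated like "O" and dropped.
    """
    spans = []
    current_type, current_tokens = None, []
    for token, label_id in zip(tokens, label_ids):
        prefix, _, ent_type = ID2LABEL[label_id].partition("-")
        if prefix == "I" and ent_type == current_type:
            current_tokens.append(token)
            continue
        # Any other tag closes the span that was open, if there was one.
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))
        if prefix == "B":
            current_type, current_tokens = ent_type, [token]
        else:
            current_type, current_tokens = None, []
    if current_type is not None:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


tokens = ["No", "Rio", "Grande", "do", "Sul", "chove"]
label_ids = [0, 8, 23, 23, 23, 0]  # O, B-LOC, I-LOC, I-LOC, I-LOC, O
print(decode_spans(tokens, label_ids))
```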


This model is a fine-tuned version of [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased) on the MultiNERD dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0618
- Precision: 0.8965
- Recall: 0.8815
- F1: 0.8889
- Accuracy: 0.9810
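
F1 is the harmonic mean of precision and recall, so the reported value can be checked from the other two numbers. A quick sanity check using only the figures listed above:

```python
# Precision and recall as reported on the evaluation set.
precision = 0.8965
recall = 0.8815

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(round(f1, 4))  # 0.8889, matching the reported F1
```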

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
- mixed_precision_training: Native AMP
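
Two of these values are derived rather than set independently. The sketch below shows how `total_train_batch_size` follows from the per-device batch size and gradient accumulation, and what a linear schedule does to the learning rate. It assumes a single device and no warmup, neither of which is stated in this card, and `linear_lr` and `total_steps` are hypothetical names used only for illustration.

```python
train_batch_size = 4
gradient_accumulation_steps = 2
num_devices = 1  # assumption: single device, consistent with the reported total of 8

# Number of examples that contribute to each optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 8


def linear_lr(step, total_steps, base_lr=1e-05, warmup_steps=0):
    """Learning rate under a linear schedule (illustrative sketch).

    Ramps up linearly during warmup (assumed to be 0 steps here), then
    decays linearly from `base_lr` to 0 at `total_steps`.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# With a hypothetical run of 1000 optimizer steps:
print(linear_lr(0, 1000))     # full learning rate at the start
print(linear_lr(500, 1000))   # half of it midway
print(linear_lr(1000, 1000))  # zero at the end
```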

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:------:|:------:|:--------:|
| 0.3792        | 0.03  | 500   | 0.2062          | 0.6752    | 0.6537 | 0.6642 | 0.9522   |
| 0.1822        | 0.06  | 1000  | 0.1587          | 0.7685    | 0.7267 | 0.7470 | 0.9618   |
| 0.152         | 0.08  | 1500  | 0.1407          | 0.7932    | 0.7675 | 0.7802 | 0.9663   |
| 0.1385        | 0.11  | 2000  | 0.1240          | 0.8218    | 0.7863 | 0.8037 | 0.9693   |
| 0.1216        | 0.14  | 2500  | 0.1129          | 0.8529    | 0.7850 | 0.8175 | 0.9710   |
| 0.1192        | 0.17  | 3000  | 0.1059          | 0.8520    | 0.7917 | 0.8208 | 0.9717   |
| 0.1165        | 0.2   | 3500  | 0.1053          | 0.8373    | 0.8071 | 0.8220 | 0.9717   |
| 0.0997        | 0.23  | 4000  | 0.0978          | 0.8434    | 0.8212 | 0.8322 | 0.9729   |
| 0.0938        | 0.25  | 4500  | 0.0963          | 0.8393    | 0.8313 | 0.8353 | 0.9736   |
| 0.0921        | 0.28  | 5000  | 0.0867          | 0.8593    | 0.8365 | 0.8478 | 0.9750   |
| 0.0943        | 0.31  | 5500  | 0.0846          | 0.8704    | 0.8268 | 0.8480 | 0.9754   |
| 0.0921        | 0.34  | 6000  | 0.0832          | 0.8556    | 0.8384 | 0.8469 | 0.9750   |
| 0.0936        | 0.37  | 6500  | 0.0802          | 0.8726    | 0.8361 | 0.8540 | 0.9760   |
| 0.0854        | 0.39  | 7000  | 0.0780          | 0.8749    | 0.8452 | 0.8598 | 0.9767   |
| 0.082         | 0.42  | 7500  | 0.0751          | 0.8812    | 0.8472 | 0.8639 | 0.9773   |
| 0.0761        | 0.45  | 8000  | 0.0745          | 0.8752    | 0.8571 | 0.8660 | 0.9772   |
| 0.0799        | 0.48  | 8500  | 0.0752          | 0.8635    | 0.8530 | 0.8582 | 0.9767   |
| 0.0728        | 0.51  | 9000  | 0.0746          | 0.8938    | 0.8398 | 0.8660 | 0.9780   |
| 0.0787        | 0.54  | 9500  | 0.0715          | 0.8791    | 0.8552 | 0.8670 | 0.9780   |
| 0.0721        | 0.56  | 10000 | 0.0707          | 0.8822    | 0.8598 | 0.8709 | 0.9785   |
| 0.0729        | 0.59  | 10500 | 0.0682          | 0.8775    | 0.8743 | 0.8759 | 0.9790   |
| 0.0707        | 0.62  | 11000 | 0.0686          | 0.8797    | 0.8696 | 0.8746 | 0.9789   |
| 0.0726        | 0.65  | 11500 | 0.0683          | 0.8944    | 0.8497 | 0.8715 | 0.9788   |
| 0.0689        | 0.68  | 12000 | 0.0667          | 0.8931    | 0.8609 | 0.8767 | 0.9795   |
| 0.0735        | 0.7   | 12500 | 0.0673          | 0.8742    | 0.8815 | 0.8779 | 0.9791   |
| 0.0725        | 0.73  | 13000 | 0.0666          | 0.8849    | 0.8713 | 0.8781 | 0.9796   |
| 0.0684        | 0.76  | 13500 | 0.0656          | 0.8881    | 0.8728 | 0.8804 | 0.9799   |
| 0.0736        | 0.79  | 14000 | 0.0644          | 0.8948    | 0.8677 | 0.8811 | 0.9800   |
| 0.0663        | 0.82  | 14500 | 0.0644          | 0.8844    | 0.8764 | 0.8803 | 0.9798   |
| 0.0652        | 0.85  | 15000 | 0.0645          | 0.8778    | 0.8845 | 0.8812 | 0.9797   |
| 0.0672        | 0.87  | 15500 | 0.0644          | 0.8788    | 0.8807 | 0.8797 | 0.9796   |
| 0.0625        | 0.9   | 16000 | 0.0630          | 0.8889    | 0.8819 | 0.8854 | 0.9804   |
| 0.0712        | 0.93  | 16500 | 0.0621          | 0.8913    | 0.8818 | 0.8866 | 0.9806   |
| 0.0629        | 0.96  | 17000 | 0.0618          | 0.8965    | 0.8815 | 0.8889 | 0.9810   |
| 0.0649        | 0.99  | 17500 | 0.0618          | 0.8953    | 0.8806 | 0.8879 | 0.9809   |


### Framework versions

- Transformers 4.38.2
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2

### Comments

Ideas, contributions, and bug reports are always welcome.

Email: rhaymisoncristian@gmail.com

 <div style="display:flex; flex-direction:row; justify-content:left">
    <a href="https://www.linkedin.com/in/heleno-betini-2b3016175/" target="_blank">
    <img src="https://img.shields.io/badge/LinkedIn-0077B5?style=for-the-badge&logo=linkedin&logoColor=white">
  </a>
  <a href="https://github.com/rhaymisonbetini" target="_blank">
    <img src="https://img.shields.io/badge/GitHub-100000?style=for-the-badge&logo=github&logoColor=white">
  </a>
 </div>