# mbart-neutralization
This model is a fine-tuned version of facebook/mbart-large-50 on the hackathon-pln-es/neutral-es dataset.
It learns to paraphrase gender-marked Spanish expressions into a gender-neutral style, e.g. "La enfermera me curó" → "El personal sanitario me curó", thereby promoting more inclusive language.
It achieves the following results on the evaluation set:
- Loss: 0.0118
- BLEU: 63.5448
- Generation length: 36.7604
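A minimal inference sketch for this checkpoint (the Hub id `rebego/mbart-neutralization` and the generation settings below are assumptions, not documented choices of this card):

```python
# Minimal inference sketch; hub id and max_length are assumptions.
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

model_id = "rebego/mbart-neutralization"
tokenizer = MBart50TokenizerFast.from_pretrained(model_id)
model = MBartForConditionalGeneration.from_pretrained(model_id)

text = "La enfermera me curó"
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
# Expected style of output: "El personal sanitario me curó"
```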
## Model description
mBART-50 is a pretrained multilingual encoder–decoder (sequence-to-sequence) model that supports 50 languages. It was designed to show that, instead of fine-tuning a separate model for each language pair, a single pre-trained model can be fine-tuned simultaneously on multiple translation directions. Building on the original mBART, it extends coverage by adding 25 more languages (for a total of 50), delivering a truly multilingual solution.
During pre-training, mBART-50 employs a denoising autoencoding objective: monolingual sentences are “noised” by randomly shuffling their order and span-masking a portion of tokens, and the model learns to reconstruct the original text.
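As a toy illustration of that denoising objective (this is not the actual pre-training code; the mask token, span length, and whitespace tokenization below are simplifications), the noising step can be sketched as:

```python
# Simplified illustration of mBART-style noising: sentence order is shuffled
# and a contiguous span of tokens is replaced by a mask token; the model is
# trained to reconstruct the original text from the noised input.
import random

def add_noise(sentences, mask_token="<mask>", mask_ratio=0.35):
    noised = sentences[:]
    random.shuffle(noised)                      # 1) permute sentence order
    out = []
    for sent in noised:
        tokens = sent.split()
        span = max(1, int(len(tokens) * mask_ratio))
        start = random.randrange(0, max(1, len(tokens) - span + 1))
        tokens[start:start + span] = [mask_token]  # 2) mask a token span
        out.append(" ".join(tokens))
    return out

original = ["La enfermera me curó .", "Fue muy amable conmigo ."]
print(add_noise(original))   # noised input
print(" ".join(original))    # reconstruction target
```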
## Intended uses
- Reducing gender bias in Spanish texts via monolingual style transfer.
- Preprocessing step in NLP pipelines, e.g. for editorial tools or inclusive content generation (see the sketch after this list).
- As a basis for further fine-tuning on related sequence-to-sequence tasks (summarization, paraphrasing).
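For the preprocessing use case, a hedged sketch using the generic `text2text-generation` pipeline (the task name and batching behaviour are assumptions, not documented settings of this checkpoint):

```python
# Sketch of using the model as a preprocessing step in a pipeline.
from transformers import pipeline

neutralizer = pipeline("text2text-generation", model="rebego/mbart-neutralization")

def neutralize(texts):
    """Rewrite a batch of Spanish sentences into gender-neutral form."""
    results = neutralizer(texts, max_length=64)
    return [r["generated_text"] for r in results]

docs = ["La enfermera me curó", "Los profesores llegaron tarde"]
print(neutralize(docs))
```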
## Limitations
- Only neutralizes gendered expressions in Spanish; it does not translate between languages.
- Quality may degrade on domain-specific or very technical texts outside the training distribution.
- May occasionally produce ungrammatical or awkward phrasing when forced to alter rare word combinations.
## Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5.6e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 2
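A sketch of how these values could be expressed with `Seq2SeqTrainingArguments` (only the values listed above come from this card; `output_dir`, `eval_strategy`, and `predict_with_generate` are assumptions):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mbart-neutralization",   # assumed
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    lr_scheduler_type="linear",
    num_train_epochs=2,
    eval_strategy="epoch",               # assumed: evaluate once per epoch
    predict_with_generate=True,          # assumed: needed for BLEU / gen length
)
```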
## Training results
| Training Loss | Epoch | Step | Validation Loss | BLEU    | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
| No log        | 1.0   | 440  | 0.0151          | 88.2841 | 34.8125 |
| 0.2281        | 2.0   | 880  | 0.0118          | 63.5448 | 36.7604 |
## Framework versions
- Transformers 4.49.0
- Pytorch 2.5.1+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0