mbart-neutralization

This model is a fine-tuned version of facebook/mbart-large-50 on the hackathon-pln-es/neutral-es dataset.

It paraphrases gender-marked Spanish expressions into a gender-neutral, inclusive style, e.g. "La enfermera me curó" → "El personal sanitario me curó".

It achieves the following results on the evaluation set:

  • Loss: 0.0118
  • BLEU: 63.5448
  • Generation length: 36.7604
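
As a quick illustration of the intended behaviour, the model can be queried through the transformers text2text-generation pipeline. This is a minimal sketch: the generation settings are illustrative, and it assumes the fine-tuned checkpoint was saved together with its tokenizer and generation config.

```python
# Minimal inference sketch; max_length is an illustrative choice and the
# checkpoint is assumed to ship its tokenizer and forced-BOS configuration.
from transformers import pipeline

neutralizer = pipeline("text2text-generation", model="rebego/mbart-neutralization")

result = neutralizer("La enfermera me curó", max_length=64)
print(result[0]["generated_text"])  # expected style: "El personal sanitario me curó"
```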

Model description

mBART-50 is a pretrained multilingual encoder–decoder (sequence-to-sequence) model covering 50 languages. It was designed to show that, instead of fine-tuning a separate model for each language pair, a single pre-trained model can be fine-tuned on many translation directions at once. It extends the original mBART by adding 25 languages, for a total of 50.

During pre-training, mBART-50 employs a denoising autoencoding objective: monolingual sentences are “noised” by randomly shuffling their order and span-masking a portion of tokens, and the model learns to reconstruct the original text.
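
The following toy sketch illustrates those two noising operations (sentence-order shuffling plus span masking) on plain strings. It is only a conceptual approximation, not the actual mBART-50 pre-training code; the mask token and masking ratio are assumptions.

```python
import random

def add_noise(sentences, mask_token="<mask>", mask_ratio=0.35):
    """Toy approximation of the denoising noise: shuffle sentence order,
    then replace one contiguous token span per sentence with a mask."""
    shuffled = sentences[:]
    random.shuffle(shuffled)                           # 1) permute sentence order
    noised = []
    for sent in shuffled:
        tokens = sent.split()
        span_len = max(1, int(len(tokens) * mask_ratio))
        start = random.randrange(len(tokens) - span_len + 1)
        tokens[start:start + span_len] = [mask_token]  # 2) mask a token span
        noised.append(" ".join(tokens))
    return noised

doc = ["La enfermera me curó.", "Después volví a casa."]
print(add_noise(doc))  # the model is trained to reconstruct the original doc
```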

Intended uses

  • Reducing gender bias in Spanish texts via monolingual style transfer.

  • Preprocessing step in NLP pipelines (e.g. for editorial tools or inclusive content generation); see the batch sketch after this list.

  • As a basis for further fine-tuning on related sequence-to-sequence tasks (summarization, paraphrasing).
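
A hedged sketch of the preprocessing use case: wrapping the model in a small batch helper so it can sit in front of other pipeline stages. The batch size, beam count, and maximum length are illustrative assumptions.

```python
# Batch neutralization helper; generation settings are illustrative and the
# checkpoint is assumed to carry the language/forced-BOS configuration it needs.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "rebego/mbart-neutralization"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

def neutralize(texts, batch_size=8):
    neutral = []
    for i in range(0, len(texts), batch_size):
        batch = tokenizer(texts[i:i + batch_size], return_tensors="pt",
                          padding=True, truncation=True)
        generated = model.generate(**batch, num_beams=4, max_length=64)
        neutral.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return neutral

print(neutralize(["La enfermera me curó", "Los profesores llegaron tarde"]))
```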

Limitations

  • Only neutralizes gendered expressions in Spanish; it does not translate between languages.

  • Quality may degrade on domain-specific or very technical texts outside the training distribution.

  • May occasionally produce ungrammatical or awkward phrasing when forced to alter rare word combinations.

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto Seq2SeqTrainingArguments follows the list):

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 2
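
For reference, here is a hedged sketch of how these values would typically map onto Seq2SeqTrainingArguments. The output_dir, eval_strategy, and predict_with_generate settings are assumptions inferred from the per-epoch evaluation table below, not values reported with the model.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: output_dir, eval_strategy and predict_with_generate are assumed.
training_args = Seq2SeqTrainingArguments(
    output_dir="mbart-neutralization",   # assumed
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    lr_scheduler_type="linear",
    num_train_epochs=2,
    eval_strategy="epoch",               # assumed from the per-epoch results below
    predict_with_generate=True,          # needed to compute BLEU / generation length
)
```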

Training results

Training Loss | Epoch | Step | Validation Loss | BLEU    | Gen Len
No log        | 1.0   | 440  | 0.0151          | 88.2841 | 34.8125
0.2281        | 2.0   | 880  | 0.0118          | 63.5448 | 36.7604

Framework versions

  • Transformers 4.49.0
  • PyTorch 2.5.1+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0