nllb-200-distilled-1.3B-finetuned-py2cpp

This model is a fine-tuned version of facebook/nllb-200-distilled-1.3B. The training dataset is not specified. It achieves the following results on the evaluation set (final checkpoint, step 330):

  • Loss: 0.8596
  • Bleu: 64.3588
  • Gen Len: 75.1636

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
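
The relationship between these values can be sketched in a few lines of plain Python. This is a minimal illustration, not the training script: the helper `linear_lr` is hypothetical, the single-device setup and the absence of warmup are assumptions (no warmup setting is listed above), and the total of 330 optimizer steps is taken from the last row of the training results.

```python
# Values copied from the hyperparameter list above.
learning_rate = 1e-4
train_batch_size = 8
gradient_accumulation_steps = 4
total_steps = 330  # final step reported in the training results

# Assumption: one device, which is consistent with total_train_batch_size = 32.
num_devices = 1
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step, base_lr=learning_rate, total=total_steps, warmup=0):
    """Hypothetical helper: linear warmup, then linear decay to zero.

    warmup=0 is an assumption; the model card lists no warmup setting.
    """
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print("effective batch size:", effective_batch_size)
for step in (0, 165, 330):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 0.0001, falls to half that value midway through training, and reaches zero at the final step.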

Training results

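| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len  |
|---------------|-------|------|-----------------|---------|----------|
| No log        | 0.99  | 33   | 3.0707          | 25.7972 | 101.6182 |
| No log        | 2.0   | 67   | 2.1456          | 27.0277 | 97.5818  |
| No log        | 2.99  | 100  | 1.5575          | 29.1338 | 94.5455  |
| No log        | 4.0   | 134  | 1.2533          | 51.6222 | 75.1273  |
| No log        | 4.99  | 167  | 1.0750          | 57.0794 | 76.2     |
| No log        | 6.0   | 201  | 0.9742          | 60.7142 | 74.1455  |
| No log        | 6.99  | 234  | 0.9128          | 63.8188 | 72.6182  |
| No log        | 8.0   | 268  | 0.8811          | 63.7263 | 75.1455  |
| No log        | 8.99  | 301  | 0.8642          | 64.1829 | 75.0909  |
| No log        | 9.85  | 330  | 0.8596          | 64.3588 | 75.1636  |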
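
Since the training data is not documented, the Epoch and Step columns are the only hint about its size. The sketch below is an inference, not a reported fact. It assumes the Epoch value equals optimizer steps times gradient_accumulation_steps divided by the number of per-device batches in one epoch, rounded to two decimals. It also assumes a single device and that the last partial batch is kept. The search range of 100 to 199 and the function name `consistent` are illustrative choices.

```python
# (epoch, step) pairs copied from the training results.
logged = [
    (0.99, 33), (2.0, 67), (2.99, 100), (4.0, 134), (4.99, 167),
    (6.0, 201), (6.99, 234), (8.0, 268), (8.99, 301), (9.85, 330),
]

grad_accum = 4        # gradient_accumulation_steps from the hyperparameters
per_device_batch = 8  # train_batch_size from the hyperparameters


def consistent(batches_per_epoch):
    """True if every logged epoch matches step * grad_accum / batches_per_epoch."""
    return all(
        abs(round(step * grad_accum / batches_per_epoch, 2) - epoch) < 1e-9
        for epoch, step in logged
    )


# Assumed search range for the number of per-device batches per epoch.
candidates = [n for n in range(100, 200) if consistent(n)]
print("batches per epoch:", candidates)

# With the last partial batch kept, n batches of 8 cover between
# (n - 1) * 8 + 1 and n * 8 examples.
for n in candidates:
    low = (n - 1) * per_device_batch + 1
    high = n * per_device_batch
    print("estimated training examples:", low, "to", high)
```

If the assumptions hold, the only value in that range matching all ten rows is 134 batches per epoch, which would put the training set at roughly 1,065 to 1,072 examples.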

Framework versions

  • Transformers 4.33.1
  • Pytorch 2.4.0
  • Datasets 3.0.1
  • Tokenizers 0.13.3