---
library_name: transformers
license: apache-2.0
base_model: Helsinki-NLP/opus-mt-en-fr
tags:
- translation
- generated_from_trainer
datasets:
- kde4
metrics:
- bleu
model-index:
- name: marian-finetuned-kde4-en-to-fr
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: kde4
      type: kde4
      config: en-fr
      split: train
      args: en-fr
    metrics:
    - name: Bleu
      type: bleu
      value: 50.54449537679619
---

# Marian Fine-Tuned KDE4 (English-to-French)

This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-fr](https://huggingface.co/Helsinki-NLP/opus-mt-en-fr) using the KDE4 dataset. It achieves the following results on the evaluation set:

- **Loss**: 0.9620
- **BLEU**: 50.5445

---

## Model Description

This English-to-French translation model has been fine-tuned specifically on the KDE4 dataset. The base model, Helsinki-NLP/opus-mt-en-fr, is part of the MarianMT family, renowned for its efficiency and high-quality neural machine translation capabilities.

---

## Intended Uses & Limitations

### Intended Uses

- Translating English text into French.
- High-quality translations in the context of software localization, especially related to KDE4.

### Limitations

- Performance may decline on texts outside the KDE4 domain.
- Struggles with idiomatic expressions, specialized technical jargon, or ambiguous content.

---

## Training & Evaluation Data

The model was fine-tuned on the KDE4 dataset, a specialized resource for machine translation in software localization. The evaluation metrics reflect the model's performance on this domain-specific data.

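To make the BLEU metric reported above more concrete, here is a minimal sketch of corpus-level BLEU in plain Python. It assumes whitespace tokenization and a single reference per sentence, both simplifications, and the names `ngram_counts` and `corpus_bleu` are illustrative rather than part of any library. It is not the scorer that produced the reported score; standard tools such as sacreBLEU apply their own tokenization, so their numbers will differ.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale (illustrative, whitespace-tokenized)."""
    matches = [0] * max_n  # clipped n-gram matches per order
    totals = [0] * max_n   # hypothesis n-grams per order
    hyp_len = 0
    ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tokens = hyp.split()
        ref_tokens = ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngram_counts(hyp_tokens, n)
            ref_counts = ngram_counts(ref_tokens, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())

    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0

    # Geometric mean of the n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the references.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


score = corpus_bleu(["the cat sat on the mat"], ["the cat sat on a mat"])
print(f"BLEU: {score:.2f}")
```

The example sentences are made up for illustration and are unrelated to KDE4. For a reportable score, prefer an established implementation over this sketch.
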

---

## Training Procedure

### Hyperparameters

- **Learning Rate**: 2e-05
- **Train Batch Size**: 32
- **Eval Batch Size**: 64
- **Seed**: 42
- **Optimizer**: AdamW with `betas=(0.9, 0.999)`, `epsilon=1e-08`
- **LR Scheduler**: Linear
- **Epochs**: 1
- **Mixed Precision Training**: Native AMP

### Results

- **Loss**: 0.9620
- **BLEU**: 50.5445

### Training Loss Progression

| Step | Training Loss |
|------|---------------|
| 500  | 1.2253        |
| 1000 | 1.2165        |
| 1500 | 1.1913        |
| 2000 | 1.1404        |
| 2500 | 1.1178        |
| 3000 | 1.0900        |
| 3500 | 1.0594        |
| 4000 | 1.0512        |
| 4500 | 1.0633        |
| 5000 | 1.0405        |
| 5500 | 1.0316        |

---

## Framework Versions

- **Transformers**: 4.47.1
- **PyTorch**: 2.5.1+cu121
- **Datasets**: 3.2.0
- **Tokenizers**: 0.21.0

---

## Example Usage

```python
from transformers import pipeline

# Load the model
model_checkpoint = "ParitKansal/marian-finetuned-kde4-en-to-fr"
translator = pipeline("translation", model=model_checkpoint)

# Translate text
translation = translator("Default to expanded threads")
print(translation)
```

This script demonstrates how to use the model for English-to-French translation tasks.

---
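
As a rough illustration of the linear LR scheduler listed under Hyperparameters, the sketch below computes the learning rate at a given optimizer step. The helper `linear_lr` is hypothetical, the zero-warmup default is an assumption (the card does not state a warmup setting), and `total_steps=6000` is a placeholder rather than the actual length of the training run.

```python
def linear_lr(step, total_steps, base_lr=2e-5, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Placeholder run length, NOT taken from the actual training run.
total_steps = 6000
for step in (0, 1500, 3000, 4500, 6000):
    print(step, linear_lr(step, total_steps))
```

With no warmup, the rate starts at the configured 2e-05 and falls in a straight line to zero at the final step.
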