Ghomala Translation Model

This is a neural machine translation model fine-tuned to translate from English to Ghomala, a Bamileke (Grassfields) language spoken in Cameroon.

🚀 Architecture

  • Encoder: UBC-NLP/serengeti-E250
  • Decoder: gpt2
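
The encoder/decoder pairing listed above fits the pattern supported by the `EncoderDecoderModel` class in transformers. The following is a minimal sketch of how two such checkpoints can be joined, not necessarily how this checkpoint was built. The function name `build_encoder_decoder` and the special-token choices are assumptions made for illustration.

```python
ENCODER_ID = "UBC-NLP/serengeti-E250"  # encoder named in this card
DECODER_ID = "gpt2"                    # decoder named in this card


def build_encoder_decoder(encoder_id=ENCODER_ID, decoder_id=DECODER_ID):
    """Join two pretrained checkpoints into one seq2seq model (downloads weights)."""
    # Imported lazily so the constants above can be inspected without transformers.
    from transformers import AutoTokenizer, EncoderDecoderModel

    # The decoder gains cross-attention layers, which start out untrained.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(encoder_id, decoder_id)

    # Assumption: GPT-2 ships without a pad token, so its EOS token is reused,
    # and its BOS token is used to start decoding.
    decoder_tokenizer = AutoTokenizer.from_pretrained(decoder_id)
    decoder_tokenizer.pad_token = decoder_tokenizer.eos_token
    model.config.decoder_start_token_id = decoder_tokenizer.bos_token_id
    model.config.pad_token_id = decoder_tokenizer.pad_token_id
    return model, decoder_tokenizer
```

Calling `build_encoder_decoder()` needs network access to fetch both checkpoints, and the combined model must be fine-tuned before it produces useful translations.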

🏋️ Training Details

  • Fine-tuned on a custom English–Ghomala parallel corpus (Bible text plus other text data)
  • Epochs: 10
  • Learning rate: 2e-5
  • BLEU score tracked with the Hugging Face evaluate library
  • Batch size: 2 (with gradient accumulation)
  • Optimizer: AdamW
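
As a rough sketch, the settings above could be expressed as `Seq2SeqTrainingArguments` keyword arguments. The card does not state the number of gradient accumulation steps, so the value 8 below is a placeholder. The output directory and the choice of the `"bleu"` metric (rather than, say, `"sacrebleu"`) are also assumptions.

```python
# Values stated in this card, plus one labeled placeholder.
TRAINING_KWARGS = {
    "num_train_epochs": 10,
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 2,
    "optim": "adamw_torch",  # AdamW
    # Assumption: the accumulation step count is not given in the card.
    "gradient_accumulation_steps": 8,
}


def effective_batch_size(kwargs, num_devices=1):
    """Number of examples contributing to each optimizer step."""
    return (kwargs["per_device_train_batch_size"]
            * kwargs["gradient_accumulation_steps"]
            * num_devices)


def make_training_args(output_dir="ghomala-nmt"):  # hypothetical output path
    """Build training arguments; requires transformers to be installed."""
    from transformers import Seq2SeqTrainingArguments
    return Seq2SeqTrainingArguments(
        output_dir=output_dir, predict_with_generate=True, **TRAINING_KWARGS)


def bleu_score(predictions, references):
    """Corpus BLEU via evaluate, with one reference string per prediction."""
    import evaluate
    metric = evaluate.load("bleu")  # assumption: "bleu" rather than "sacrebleu"
    result = metric.compute(predictions=predictions,
                            references=[[ref] for ref in references])
    return result["bleu"]


print(effective_batch_size(TRAINING_KWARGS))  # 2 * 8 * 1 = 16
```

With a per-device batch size of 2, gradient accumulation is what determines how many examples feed each optimizer step, so the placeholder should be replaced with the value actually used.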

📌 Usage Example

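from transformers import pipeline

# Load the fine-tuned English → Ghomala model from the Hugging Face Hub
translator = pipeline("translation", model="DS4H-ICTU/english-ghomala-translation-model-encoderdecoder")

# The pipeline returns a list with one dict per input sentence
result = translator("The woman gave water to the prophet.")
print(result[0]["translation_text"])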

🎯 Intended Use

  • Cultural and educational preservation
  • Language learning and community translation tools

⚠️ Limitations

  • Trained on a limited amount of Ghomala data, so output quality is still limited
  • May hallucinate content or produce repeated text
  • Translates only in the English → Ghomala direction for now
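
Since repeated output is a known failure mode, downstream tools may want to flag it before showing a translation. The sketch below is one simple heuristic, not part of this model: it measures how many word n-grams in a string are repeats. The function names and the 0.3 threshold are illustrative assumptions. `no_repeat_ngram_size`, `repetition_penalty`, and `max_new_tokens` are standard transformers generation parameters, but the values shown are untuned guesses.

```python
from collections import Counter


def repeated_ngram_ratio(text, n=3):
    """Fraction of word n-grams that repeat an earlier n-gram in the text."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0  # text shorter than n words
    counts = Counter(ngrams)
    repeats = sum(count - 1 for count in counts.values())
    return repeats / len(ngrams)


def looks_degenerate(text, n=3, threshold=0.3):
    """Flag output whose repeated n-gram ratio exceeds the threshold."""
    return repeated_ngram_ratio(text, n) > threshold


# Generation settings that discourage loops (values are untuned assumptions).
GENERATION_KWARGS = {
    "no_repeat_ngram_size": 3,
    "repetition_penalty": 1.2,
    "max_new_tokens": 64,
}

print(looks_degenerate("a b c a b c a b c"))                    # True
print(looks_degenerate("the woman gave water to the prophet"))  # False
```

The generation settings can typically be forwarded through the pipeline call, for example `translator(text, **GENERATION_KWARGS)`. Flagged outputs are best routed to a human reviewer rather than discarded silently.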

📚 Citation

@misc{ghomala_translation_model,
  title={Ghomala Translation Model},
  author={Group 2},
  howpublished={\url{https://huggingface.co/DS4H-ICTU/english-ghomala-translation-model-encoderdecoder}},
  year={2025}
}

Model size: 430M params (Safetensors, tensor type F32)