FLAN-T5-base Fine-tuned for Dhivehi-English-Latin Translation

This model is a fine-tuned version of google/flan-t5-base for translation between Dhivehi and English, and for transliteration between Dhivehi (Thaana script) and its Latin-script romanization.

Model description

The model was trained on a combination of:

  • English-Dhivehi parallel corpus
  • Dhivehi-Latin transliteration pairs

It supports the following translation directions:

  • English to Dhivehi (en2dv)
  • Dhivehi to English (dv2en)
  • Dhivehi to Latin script (dv2latin)
  • Latin script to Dhivehi (latin2dv)
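
The direction is selected by prepending its tag to the source text. A minimal illustration follows (the complete inference script is in the Usage section below):

direction = "en2dv"                    # any of the four tags above
source_text = "Good morning"
prompt = f"{direction} {source_text}"  # -> "en2dv Good morning", fed to the model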

Training and Evaluation

The model was trained for 3 epochs with the following parameters:

  • Learning rate: 3e-4
  • Batch size: 2
  • Weight decay: 0.01
  • Max sequence length: 128 tokens
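
A minimal sketch of how these hyperparameters could be expressed with the Hugging Face Seq2SeqTrainingArguments is shown below. The actual training script and dataset preprocessing are not published with this card, so everything beyond the values listed above (output directory, eval batch size, predict_with_generate) is an assumption.

from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the training configuration from the listed
# hyperparameters; unlisted arguments are placeholders, not the values used.
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-dhivehi-en-latin",  # placeholder
    num_train_epochs=3,
    learning_rate=3e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,   # assumed equal to the training batch size
    weight_decay=0.01,
    predict_with_generate=True,     # assumption: needed to compute ROUGE during evaluation
)
# The 128-token maximum sequence length is applied at tokenization time,
# not through Seq2SeqTrainingArguments.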

Final training metrics:

  • Training loss: 1.305
  • Training runtime: 535,653 seconds
  • Training samples per second: 3.733

Final evaluation metrics:

  • Evaluation loss: 1.167
  • ROUGE-1: 0.415
  • ROUGE-2: 0.288
  • ROUGE-L: 0.402
  • ROUGE-Lsum: 0.405
  • Evaluation runtime: 131,307 seconds
  • Evaluation samples per second: 5.077
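
The ROUGE scores above are on a 0-1 scale. A minimal sketch of how such scores can be computed with the evaluate library is shown below; the exact evaluation pipeline used for this model is not published, and the example strings are placeholders only.

import evaluate

rouge = evaluate.load("rouge")

# In practice, predictions come from model.generate on the evaluation split
# and references are the gold target sentences; these strings are placeholders.
predictions = ["the parliament discussed the new tax proposal"]
references = ["parliament discussed a new tax proposal today"]

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum (values between 0 and 1)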

Usage

To use the model, load it with the from_pretrained method:

from transformers import T5ForConditionalGeneration, AutoTokenizer

model = T5ForConditionalGeneration.from_pretrained("alakxender/flan-t5-base-dhivehi-en-latin")
tokenizer = AutoTokenizer.from_pretrained("alakxender/flan-t5-base-dhivehi-en-latin")

supported_languages = ["en2dv", "dv2en", "dv2latin", "latin2dv"]  # supported direction tags

def translate(source_text, target_language):
    # Prepend the direction tag (e.g. "en2dv") to the source text
    prompt = f"{target_language.strip()} {source_text.strip()}"
    inputs = tokenizer(prompt, return_tensors="pt", max_length=128, truncation=True)
    # Beam search with settings that discourage repetition
    output_ids = model.generate(
        **inputs,
        max_length=128,
        min_length=10,
        num_beams=4,
        early_stopping=True,
        no_repeat_ngram_size=3,
        repetition_penalty=1.2,
        do_sample=False,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id
    )
    # Decode the generated ids back to text, dropping special tokens
    result = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return result

# Example usage
source_text = "Concerns over prepayment of GST raised in parliament"
target_language = "en2dv"
translated_text = translate(source_text, target_language)
print(translated_text)
# Output: ރައްޔިތުންގެ މަޖިލީހުގައި ޖީއެސްޓީގެ އަގު ބޮޑުވުމާ ގުޅިގެން ކަންބޮޑުވުން ފާޅުކޮށްފި

source_text = "ދުނިޔޭގެ އެކި ކަންކޮޅުތަކުން 1.4 މިލިއަން މީހުން މައްކާއަށް ޖަމާވެފައި"
target_language = "dv2en"
translated_text = translate(source_text, target_language)
print(translated_text)
# Output: 1.4 million people gathered in Mecca from different parts of the world

source_text = "ވައިބާރުވުމުން ކުޅުދުއްފުށީ އެއާޕޯޓަށް ނުޖެއްސިގެން މޯލްޑިވިއަންގެ ބޯޓެއް އެނބުރި މާލެއަށް"
target_language = "dv2latin"
translated_text = translate(source_text, target_language)
print(translated_text)
# Output: Vaibaruvumun kulhudhuhfushee eaapoatah nujehsigen moaldiviange boateh enburi maaleah

source_text = "Paakisthaanuge skoolu bahakah dhin hamalaaehgai thin kuhjakaai bodu dhe meehaku maruvehje"
target_language = "latin2dv"
translated_text = translate(source_text, target_language)
print(translated_text)
# Output: ޕާކިސްތާނުގެ ސްކޫލު ބަހަކަށް ދިން ހަމަލާއެއްގައި ތިން ކުއްޖަކާއި ބޮޑު ދެ މީހަކު މަރުވެއްޖެ
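
For larger workloads, generation can also be run in batches and on a GPU. The following is a minimal sketch that reuses the prompt format from translate above; the second input sentence and the reduced set of generation options are illustrative assumptions, not part of the original setup.

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

texts = [
    "Concerns over prepayment of GST raised in parliament",
    "Heavy rain is expected across the atolls this weekend",  # illustrative input
]
prompts = [f"en2dv {t.strip()}" for t in texts]

inputs = tokenizer(prompts, return_tensors="pt", padding=True,
                   max_length=128, truncation=True).to(device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_length=128, num_beams=4, early_stopping=True)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True))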

Note: The model was fine-tuned mostly on English-Dhivehi local news articles, so it may not perform well on other domains.
