English-Dhivehi MT5 Translation Model

This is a bilingual machine translation model fine-tuned from the google/mt5-base checkpoint to translate between English and Dhivehi in both directions. The translation direction is selected by a prefix on the input text: "2dv" for English→Dhivehi and "2en" for Dhivehi→English.

Performance

BLEU Scores

English → Dhivehi: 24.32
Dhivehi → English: 50.79

The gap between the two directions likely reflects the relative ease of generating fluent English from Dhivehi input, compared with the harder morphological and syntactic challenges of generating Dhivehi from English.
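For readers unfamiliar with the metric, the sketch below shows how a corpus-level BLEU score is assembled from clipped n-gram precisions and a brevity penalty. It is an illustration only: this card does not state which BLEU implementation or tokenization produced the scores above, and the sketch assumes whitespace tokenization, a single reference per sentence, and no smoothing, so it will not necessarily reproduce them. The name corpus_bleu is a hypothetical helper, not part of this repository.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Unsmoothed corpus BLEU (0-100), one reference per hypothesis.

    Assumption: whitespace tokenization, which is a simplification.
    """
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clipping: an n-gram is credited at most as often as it
            # appears in the reference.
            matches[n - 1] += sum((h_counts & r_counts).values())
            totals[n - 1] += sum(h_counts.values())
    # Without smoothing, any zero precision makes the score zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty applies when the output is not longer than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)
```

An exact match scores 100 and an output sharing no words with its reference scores 0. For reported results, an established evaluation tool is preferable to a hand-rolled function like this one.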

Example Translations

English → Dhivehi

Input: Hello, how are you?
Output: ހެލޯ، ކިހިނެއްވީ؟

Input: I love reading books.
Output: އަހަރެން ފޮތް ކިޔަން ވަރަށް ލޯބިވޭ.

Dhivehi → English

Input: ކިހިނެއްތަ އުޅެނީ؟
Output: how's it going?

Input: ރާއްޖެއަކީ ރީތި ޤައުމެކެވެ.
Output: sri lanka is a beautiful country. (Note: the input refers to the Maldives, so this output is a mistranslation; dataset quality may affect accuracy)

Usage

from transformers import MT5ForConditionalGeneration, MT5Tokenizer

# Load model and tokenizer
model = MT5ForConditionalGeneration.from_pretrained("./mt5-base-dv-en/best_model")
tokenizer = MT5Tokenizer.from_pretrained("./mt5-base-dv-en/best_model")

# English to Dhivehi
def translate_to_dhivehi(text):
    inputs = tokenizer("2dv" + text, return_tensors="pt", max_length=512, truncation=True)
    outputs = model.generate(
        **inputs,
        max_length=100,
        num_beams=5,
        length_penalty=2.5,
        repetition_penalty=1.5,
        early_stopping=True,
        no_repeat_ngram_size=2
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Dhivehi to English
def translate_to_english(text):
    inputs = tokenizer("2en" + text, return_tensors="pt", max_length=512, truncation=True)
    outputs = model.generate(
        **inputs,
        max_length=100,
        num_beams=5,
        length_penalty=2.5,
        repetition_penalty=1.5,
        early_stopping=True,
        no_repeat_ngram_size=2
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
print(translate_to_dhivehi("Hello, how are you?"))
print(translate_to_english("ކިހިނެއްތަ އުޅެނީ؟"))
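The two functions above translate one string at a time. When many sentences need translating, a batched variant along the following lines may be more convenient. This is a sketch rather than part of the released code: translate_batch is a hypothetical name, the model and tokenizer are passed in as arguments, and the generation settings are simply copied from the snippet above.

```python
def translate_batch(texts, prefix, model, tokenizer):
    """Translate a list of strings with a single generate() call.

    prefix is "2dv" (English to Dhivehi) or "2en" (Dhivehi to English).
    """
    # Same prefix concatenation as the single-sentence functions above.
    prefixed = [prefix + text for text in texts]
    # padding=True pads every input to the longest one in the batch.
    inputs = tokenizer(
        prefixed,
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=512,
    )
    outputs = model.generate(
        **inputs,
        max_length=100,
        num_beams=5,
        length_penalty=2.5,
        repetition_penalty=1.5,
        early_stopping=True,
        no_repeat_ngram_size=2,
    )
    # batch_decode returns one decoded string per input.
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

With the model and tokenizer loaded as shown earlier, a call would look like translate_batch(["Hello, how are you?", "I love reading books."], "2dv", model, tokenizer).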

Model Architecture

Base Model: google/mt5-base
Instruction Prefixes:
- "2dv" for English → Dhivehi
- "2en" for Dhivehi → English
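As a small illustration of the prefix scheme, the helper below maps a direction name to its prefix and prepends it to the input. The prefix strings come from the list above; the direction keys ("en-dv", "dv-en") and the names PREFIXES and add_prefix are hypothetical, and the concatenation without a separator mirrors the usage snippet rather than any documented requirement.

```python
# Prefix strings are from this card; the dictionary keys are illustrative names.
PREFIXES = {
    "en-dv": "2dv",  # English to Dhivehi
    "dv-en": "2en",  # Dhivehi to English
}


def add_prefix(text, direction):
    """Return text with the prefix for the given direction prepended."""
    if direction not in PREFIXES:
        raise ValueError(f"unknown direction: {direction!r}")
    # No separator is added, matching the usage snippet in this card.
    return PREFIXES[direction] + text
```

Rejecting unknown directions early avoids silently sending unprefixed text to the model, which was only fine-tuned on the two prefixes listed.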

Disclaimer

This model is an experimental, educational fine-tune intended to explore low-resource translation for the Dhivehi language. It is not production-ready and should not be used in high-stakes applications without further evaluation and refinement.

alakxender/mt5-base-dv-en