
Model Card for syubraj/RomanEng2Nep-v2

This model was trained for 8,500 steps on a dataset of fewer than 110k Romanized Nepali to Nepali transliteration pairs.

Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

  • Model type: Translation (Romanized Nepali to Nepali transliteration)
  • Language(s) (NLP): Nepali, English
  • License: Apache 2.0
  • Finetuned from model: google/mt5-small
  • Model size: ~300M parameters (F32)

How to Get Started with the Model

Use the code below to get started with the model.


from transformers import AutoTokenizer, MT5ForConditionalGeneration

checkpoint = "syubraj/RomanEng2Nep-v2"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = MT5ForConditionalGeneration.from_pretrained(checkpoint)

# Set max sequence length
max_seq_len = 20

def translate(text):
    # Tokenize the input text with a max length of 20
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_seq_len)

    # Generate the transliteration, explicitly capping the output length
    translated = model.generate(**inputs, max_length=max_seq_len)

    # Decode the translated tokens back to text
    translated_text = tokenizer.decode(translated[0], skip_special_tokens=True)
    return translated_text

# Example usage
source_text = "muskuraudai"  # Example Romanized Nepali text
translated_text = translate(source_text)
print(f"Translated Text: {translated_text}")

Training Data

syubraj/roman2nepali-transliteration
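
The dataset can be loaded for inspection with the 🤗 datasets library. This is a minimal sketch; the "train" split shown here is an assumption, so print the dataset object first to see the actual splits and columns.

from datasets import load_dataset

# Load the transliteration pairs from the Hub
dataset = load_dataset("syubraj/roman2nepali-transliteration")

print(dataset)              # shows the available splits and column names
print(dataset["train"][0])  # assumption: a "train" split exists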

Training Hyperparameters

  • Training regime: full-precision (F32) fine-tuning with the Seq2SeqTrainingArguments below:

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="/content/drive/MyDrive/romaneng2nep_v2/",
    eval_strategy="steps",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=2,
    predict_with_generate=True,
)
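
These arguments plug into a standard Seq2SeqTrainer. The sketch below shows one plausible wiring; the column names ("roman", "nepali"), the split names, and the preprocessing function are assumptions for illustration, not the exact original training script.

from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    MT5ForConditionalGeneration,
    Seq2SeqTrainer,
)

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

raw = load_dataset("syubraj/roman2nepali-transliteration")

def preprocess(batch):
    # Assumed column names; check the dataset card for the real schema
    model_inputs = tokenizer(batch["roman"], truncation=True, max_length=20)
    labels = tokenizer(text_target=batch["nepali"], truncation=True, max_length=20)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = raw.map(preprocess, batched=True)

# Pads inputs and labels dynamically per batch for encoder-decoder models
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],  # assumption: an evaluation split exists
    data_collator=data_collator,
    tokenizer=tokenizer,
)
trainer.train()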

Training and Validation Metrics

Step    Training Loss    Validation Loss    Gen Len (avg. tokens)
 500       21.636200         9.776628       2.001900
1000       10.103400         6.105016       2.077900
1500        6.830800         5.081259       3.811600
2000        6.003100         4.702793       4.237300
2500        5.690200         4.469123       4.700000
3000        5.443100         4.274406       4.808300
3500        5.265300         4.121417       4.749400
4000        5.128500         3.989708       4.782300
4500        5.007200         3.885391       4.805100
5000        4.909600         3.787640       4.874800
5500        4.836000         3.715750       4.855500
6000        4.733000         3.640963       4.962000
6500        4.673500         3.587330       5.011600
7000        4.623800         3.531883       5.068300
7500        4.567400         3.481622       5.108500
8000        4.523200         3.445404       5.092700
8500        4.464000         3.413630       5.132700