---
base_model:
- google/mt5-small
datasets:
- syubraj/roman2nepali-transliteration
language:
- ne
- en
library_name: transformers
license: apache-2.0
pipeline_tag: translation
tags:
- nepali
- roman english
- translation
- transliteration
new_version: syubraj/romaneng2nep
---
# RomanEng2Nep-v2

This model was trained for 8,500 steps on a dataset of fewer than 110k examples.

### Model Description

RomanEng2Nep-v2 is a fine-tuned [google/mt5-small](https://huggingface.co/google/mt5-small) model that transliterates Romanized Nepali (Nepali written in the Latin script) into Devanagari Nepali.

- **Model type:** mT5 sequence-to-sequence (transliteration)
- **Language(s) (NLP):** Nepali, English (Romanized)
- **License:** Apache 2.0
- **Finetuned from model:** [google/mt5-small](https://huggingface.co/google/mt5-small)
### Model Sources

- **Repository:** [syubraj/RomanEng2Nep-v2](https://huggingface.co/syubraj/RomanEng2Nep-v2)
## How to Get Started with the Model

Use the code below to get started with the model.
```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

checkpoint = "syubraj/RomanEng2Nep-v2"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = MT5ForConditionalGeneration.from_pretrained(checkpoint)

# Set max sequence length
max_seq_len = 20

def translate(text):
    # Tokenize the input text, truncating to max_seq_len tokens
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_seq_len)

    # Generate the transliteration
    translated = model.generate(**inputs)

    # Decode the generated tokens back to text
    translated_text = tokenizer.decode(translated[0], skip_special_tokens=True)
    return translated_text

# Example usage
source_text = "timilai kasto cha?"  # Example Romanized Nepali text
translated_text = translate(source_text)
print(f"Translated Text: {translated_text}")
```
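
For quick predictions, the same checkpoint should also work through the 🤗 `text2text-generation` pipeline. This is a minimal sketch, not part of the original card; the `max_new_tokens` value is an illustrative assumption rather than a tuned setting.

```python
from transformers import pipeline

# Hypothetical convenience wrapper; generation settings are assumptions.
transliterate = pipeline("text2text-generation", model="syubraj/RomanEng2Nep-v2")

result = transliterate("timilai kasto cha?", max_new_tokens=20)
print(result[0]["generated_text"])
```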
### Training Data

[syubraj/roman2nepali-transliteration](https://huggingface.co/datasets/syubraj/roman2nepali-transliteration)
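
For a quick look at the data, the dataset can be loaded with the 🤗 `datasets` library. A minimal sketch; the split and column names are assumptions, so inspect the printed structure rather than relying on them.

```python
from datasets import load_dataset

# Load the transliteration dataset from the Hub
dataset = load_dataset("syubraj/roman2nepali-transliteration")

print(dataset)               # available splits and columns
print(dataset["train"][0])   # one example (assumes a "train" split)
```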
#### Training Hyperparameters

- **Training regime:** fine-tuned with the following `Seq2SeqTrainingArguments`:
```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="/content/drive/MyDrive/romaneng2nep_v2/",
    eval_strategy="steps",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=2,
    predict_with_generate=True,
)
```
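
For context, here is a minimal sketch of how these arguments plug into a `Seq2SeqTrainer`. The `tokenized_train` and `tokenized_eval` variables are hypothetical names standing in for pre-tokenized dataset splits; the actual training script is not included in this card.

```python
from transformers import DataCollatorForSeq2Seq, Seq2SeqTrainer

# Assumes `model` and `tokenizer` are loaded as in the usage example above,
# and `tokenized_train` / `tokenized_eval` are pre-tokenized splits
# (hypothetical names, not defined in this card).
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_eval,
    tokenizer=tokenizer,
    data_collator=data_collator,
)
trainer.train()
```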
## Training and Validation Metrics

| Step | Training Loss | Validation Loss | Gen Len (tokens) |
|------|---------------|-----------------|------------------|
| 500 | 21.636200 | 9.776628 | 2.001900 |
| 1000 | 10.103400 | 6.105016 | 2.077900 |
| 1500 | 6.830800 | 5.081259 | 3.811600 |
| 2000 | 6.003100 | 4.702793 | 4.237300 |
| 2500 | 5.690200 | 4.469123 | 4.700000 |
| 3000 | 5.443100 | 4.274406 | 4.808300 |
| 3500 | 5.265300 | 4.121417 | 4.749400 |
| 4000 | 5.128500 | 3.989708 | 4.782300 |
| 4500 | 5.007200 | 3.885391 | 4.805100 |
| 5000 | 4.909600 | 3.787640 | 4.874800 |
| 5500 | 4.836000 | 3.715750 | 4.855500 |
| 6000 | 4.733000 | 3.640963 | 4.962000 |
| 6500 | 4.673500 | 3.587330 | 5.011600 |
| 7000 | 4.623800 | 3.531883 | 5.068300 |
| 7500 | 4.567400 | 3.481622 | 5.108500 |
| 8000 | 4.523200 | 3.445404 | 5.092700 |
| 8500 | 4.464000 | 3.413630 | 5.132700 |