---
library_name: transformers
tags:
- roman eng2nep
- translation
- transliteration
license: mit
datasets:
- syubraj/roman2nepali-transliteration
language:
- en
- ne
base_model:
- google-t5/t5-base
pipeline_tag: translation
new_version: syubraj/RomanEng2Nep-v2
---


This is the model card of a 🤗 Transformers model that has been pushed to the Hub. This model card was automatically generated.

- **Model type:** Sequence-to-sequence (T5) transliteration model
- **Language(s) (NLP):** Roman English, Nepali
- **License:** MIT
- **Finetuned from model:** [google-t5/t5-base](https://huggingface.co/google-t5/t5-base)


## How to Get Started with the Model

Use the code below to get started with the model.
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned model and tokenizer
model_name = 'syubraj/romaneng2nep'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Set max sequence length
max_seq_len = 30

def translate(text):
    # Tokenize the input text, truncating to max_seq_len tokens
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_seq_len)

    # Generate the transliteration, capping the output length at max_seq_len
    translated = model.generate(**inputs, max_length=max_seq_len)

    # Decode the generated tokens back to text
    translated_text = tokenizer.decode(translated[0], skip_special_tokens=True)
    return translated_text

# Example usage
source_text = "timilai kasto cha?"  # Example Romanized Nepali text
translated_text = translate(source_text)
print(f"Translated Text: {translated_text}")
```
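Since the card's `pipeline_tag` is `translation`, the high-level `pipeline` API should also work for quick experiments. This is a minimal sketch assuming default generation settings:

```python
from transformers import pipeline

# Build a translation pipeline directly from the Hub checkpoint
transliterator = pipeline("translation", model="syubraj/romaneng2nep")

# The pipeline returns a list of dicts with a "translation_text" key
result = transliterator("timilai kasto cha?", max_length=30)
print(result[0]["translation_text"])
```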


## Training Details
```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="/kaggle/working/romaneng2nep/",
    eval_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=3,
    predict_with_generate=True,
    fp16=True,
)
```
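For context, these arguments would typically be passed to a `Seq2SeqTrainer` together with tokenized splits of the dataset. The sketch below is an assumption of what the surrounding training script could look like, not the author's actual code; in particular, the dataset column names (`roman`, `nepali`) are hypothetical and should be checked against the dataset card:

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer)

# Base model per the card's metadata
tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

def preprocess(batch):
    # Hypothetical column names; verify them on the dataset card
    model_inputs = tokenizer(batch["roman"], max_length=30, truncation=True)
    labels = tokenizer(text_target=batch["nepali"], max_length=30, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

raw = load_dataset("syubraj/roman2nepali-transliteration")
tokenized = raw.map(preprocess, batched=True,
                    remove_columns=raw["train"].column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,  # the Seq2SeqTrainingArguments defined above
    train_dataset=tokenized["train"],
    eval_dataset=tokenized.get("validation", tokenized["train"]),  # assumed split names
    tokenizer=tokenizer,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```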

### Training Data

The model was fine-tuned on the [syubraj/roman2nepali-transliteration](https://huggingface.co/datasets/syubraj/roman2nepali-transliteration) dataset.
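
To inspect the data before preprocessing, it can be loaded directly with 🤗 Datasets; the split name below is an assumption and should be verified on the dataset card:

```python
from datasets import load_dataset

# Download the transliteration pairs from the Hub
dataset = load_dataset("syubraj/roman2nepali-transliteration")

# Print the split names, column names, and one example row
print(dataset)
print(dataset["train"][0])  # assumes a "train" split
```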