This is a GPT-2 model fine-tuned on Dhivehi text. It was trained on a curated dataset of Dhivehi Wikipedia articles and can be used for Dhivehi text generation.
Evaluation metrics on the test set:
from transformers import GPT2LMHeadModel, GPT2TokenizerFast
# Load model and tokenizer
model = GPT2LMHeadModel.from_pretrained("alakxender/dhivehi-gpt2-base")
tokenizer = GPT2TokenizerFast.from_pretrained("alakxender/dhivehi-gpt2-base")
# Prepare your prompt
prompt = "ދިވެހިރާއްޖެއަކީ"
inputs = tokenizer(prompt, return_tensors="pt")
# Generate text
outputs = model.generate(
    **inputs,
    max_length=200,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    num_return_sequences=1
)
# Decode the generated text
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
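The `temperature` and `top_p` arguments in the call above shape how each next token is sampled. The following is a minimal, dependency-free sketch of temperature scaling followed by nucleus (top-p) filtering on a toy distribution. It is an illustration only, not the `transformers` implementation; the function name `top_p_filter` and the logits in `toy_logits` are hypothetical.

```python
import math


def top_p_filter(logits, temperature=0.7, top_p=0.9):
    """Return {token_index: probability} after temperature scaling and top-p filtering."""
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax over the scaled logits.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the smallest set of highest-probability tokens whose
    # cumulative probability reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1.
    return {i: probs[i] / mass for i in kept}


# Hypothetical scores for a 4-token vocabulary (not taken from the model).
toy_logits = [2.0, 1.0, 0.5, -1.0]
filtered = top_p_filter(toy_logits)
print(filtered)
```

Lowering `temperature` concentrates probability on the top tokens, and lowering `top_p` shrinks the candidate set further, so both push generation toward more predictable text.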
The model was trained using the following configuration:
This model is suitable for:
Not intended for:
Base model: openai-community/gpt2