Dhivehi GPT-2

A GPT-2 language model trained on Dhivehi text data for text generation.

Model Details

  • Architecture: GPT-2 (a configuration sketch follows this list)
  • Vocab Size: 32,000 tokens
  • Context Length: 1024 tokens
  • Embedding Size: 768
  • Layers: 12
  • Attention Heads: 12
  • Total Parameters: ~124M
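
The listed hyperparameters correspond to a standard GPT-2 configuration. The snippet below is a minimal sketch of how an equivalent model could be instantiated with the Hugging Face transformers API; it is illustrative only and not necessarily the exact code used to build this checkpoint.

```python
# Sketch only: build a GPT-2 model matching the hyperparameters listed above.
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(
    vocab_size=32_000,   # Vocab Size
    n_positions=1024,    # Context Length
    n_embd=768,          # Embedding Size
    n_layer=12,          # Layers
    n_head=12,           # Attention Heads
)
model = GPT2LMHeadModel(config)
print(f"{model.num_parameters() / 1e6:.1f}M parameters")
```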

Training

The model was trained on Dhivehi text data with the following configuration (a Trainer-style sketch of these settings follows the list):

  • Training Epochs: 3
  • Batch Size: 16 (4 per device with gradient accumulation of 4)
  • Learning Rate: 5e-4 with cosine decay
  • Weight Decay: 0.01
  • Warmup: 10% of training steps
  • Mixed Precision Training (FP16)
  • Early Stopping with patience of 3
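
For reference, these settings map roughly onto the Hugging Face Trainer API as sketched below. This is an assumption-laden sketch, not the actual training script: the dataset objects (train_ds, eval_ds), the output directory name, and the model variable are placeholders, and the argument names follow recent transformers versions.

```python
# Sketch only: approximate the training configuration listed above with the
# standard Hugging Face Trainer API. model, train_ds and eval_ds are
# placeholders; the real training script is not part of this card.
from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

args = TrainingArguments(
    output_dir="dv-articles-sm-gpt2",
    num_train_epochs=3,                   # Training Epochs
    per_device_train_batch_size=4,        # 4 per device ...
    gradient_accumulation_steps=4,        # ... x 4 accumulation = effective 16
    learning_rate=5e-4,
    lr_scheduler_type="cosine",           # cosine decay
    weight_decay=0.01,
    warmup_ratio=0.1,                     # warmup over 10% of training steps
    fp16=True,                            # mixed precision training
    eval_strategy="epoch",                # evaluation needed for early stopping
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```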

Usage

```python
from grapp import DhivehiGPT2Generator  # wrapper class used for generation


def simple_generate(prompt, model_path):
    # Load the Dhivehi GPT-2 checkpoint and return a single generated continuation.
    generator = DhivehiGPT2Generator(model_path)
    return generator.generate_text(prompt, max_length=200)[0]


# Example usage; the prompt roughly means "Headline: The Maldives' ..."
result = simple_generate("ސުރުޚީ: ރާއްޖޭގެ", "alakxender/dv-articles-sm-gpt2")
print(result)
```
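
If the grapp helper is not available, the checkpoint can also be loaded directly with the transformers API. The sketch below assumes the repository ships standard tokenizer files alongside the weights; the generation parameters are illustrative.

```python
# Sketch only: load the checkpoint with plain transformers, assuming standard
# tokenizer files are present in the model repo.
from transformers import AutoTokenizer, GPT2LMHeadModel

tokenizer = AutoTokenizer.from_pretrained("alakxender/dv-articles-sm-gpt2")
model = GPT2LMHeadModel.from_pretrained("alakxender/dv-articles-sm-gpt2")

inputs = tokenizer("ސުރުޚީ: ރާއްޖޭގެ", return_tensors="pt")
outputs = model.generate(**inputs, max_length=200, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```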