Somali Agriculture GPT-2 Model

This is a small GPT-2 language model trained on about 10,000 Somali prompt-response pairs about agriculture. It is meant to generate Somali text that answers agricultural questions and supports educational content.

Model Details

  • Architecture: GPT-2 (4 layers, 4 attention heads)
  • Vocabulary: Custom-trained Byte-Pair Encoding (BPE) tokenizer
  • Training Data: ~10,000 prompt-response pairs in Somali
  • Epochs: 5
  • Embedding Size: 256
  • Context Length: 512 tokens
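
To give a feel for how small this configuration is, the sketch below counts parameters from the settings listed above. It assumes the standard GPT-2 layout (MLP width of 4x the embedding size, biases on all linear layers, output head tied to the token embeddings). The function name `gpt2_param_count` is hypothetical, and the vocabulary size is not stated in this card, so it is left as an input.

```python
def gpt2_param_count(n_layer: int, n_embd: int, n_positions: int, vocab_size: int) -> int:
    """Count trainable parameters in a standard GPT-2 stack with tied embeddings."""
    token_emb = vocab_size * n_embd      # shared with the output head (assumed tied)
    position_emb = n_positions * n_embd  # learned position embeddings
    # Per block: attention (4*d^2 + 4*d), MLP with 4x width (8*d^2 + 5*d),
    # and two layer norms (4*d), which sums to 12*d^2 + 13*d.
    per_block = 12 * n_embd**2 + 13 * n_embd
    final_norm = 2 * n_embd
    return token_emb + position_emb + n_layer * per_block + final_norm


# Sanity check against the published GPT-2 small configuration.
assert gpt2_param_count(12, 768, 1024, 50257) == 124_439_808

# Settings from the list above; vocab_size=0 isolates the non-vocabulary part.
non_vocab = gpt2_param_count(n_layer=4, n_embd=256, n_positions=512, vocab_size=0)
print(non_vocab)  # 3290624
```

The number of attention heads does not change the count, since the heads split the same projection matrices. Under these assumptions, each additional 1,000 vocabulary entries adds 256,000 parameters to the token embedding table.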

Intended Use

This model can be used to:

  • Generate Somali answers to agricultural questions
  • Create educational materials for Somali farmers
  • Build Somali chatbots focused on agriculture

Example Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and its custom BPE tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("tacab/somali-agriculture")
tokenizer = AutoTokenizer.from_pretrained("tacab/somali-agriculture")

# Somali prompt: "What is the benefit of fertilizer?"
prompt = "Maxay tahay faa'iidada bacriminta?"
inputs = tokenizer(prompt, return_tensors="pt")

# max_length counts the prompt tokens as well as the generated ones
outputs = model.generate(**inputs, max_length=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
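
For the chatbot use case, it helps to know that a causal language model returns the prompt followed by its continuation, so the decoded text normally starts with the question itself. The sketch below shows one way to return only the answer. The helper names (`strip_prompt`, `answer`, `make_generator`) are hypothetical and not part of this model or of `transformers`; the `max_new_tokens=80` setting is an arbitrary choice for illustration.

```python
from typing import Callable


def strip_prompt(prompt: str, generated: str) -> str:
    """Return only the continuation when the decoded text starts with the prompt."""
    if generated.startswith(prompt):
        return generated[len(prompt):].strip()
    # Fall back to the full text if the tokenizer did not round-trip the prompt exactly.
    return generated.strip()


def answer(question: str, generate_text: Callable[[str], str]) -> str:
    """Generate a reply for `question` and drop the echoed question."""
    return strip_prompt(question, generate_text(question))


def make_generator(model_id: str = "tacab/somali-agriculture") -> Callable[[str], str]:
    """Build a prompt -> text function (requires transformers and torch)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    def generate_text(prompt: str) -> str:
        inputs = tokenizer(prompt, return_tensors="pt")
        # max_new_tokens limits only the generated part, not the prompt
        outputs = model.generate(**inputs, max_new_tokens=80)
        return tokenizer.decode(outputs[0], skip_special_tokens=True)

    return generate_text
```

A call such as `answer("Maxay tahay faa'iidada bacriminta?", make_generator())` would then return the generated text without the question in front of it.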

Model Size

  • Parameters: 7.14M
  • Tensor Type: F32 (Safetensors)