🧠 Kurdish NER with XLM-R

This is a fine-tuned xlm-roberta-base model for Named Entity Recognition (NER) in Kurmanji Kurdish. It was trained on a manually annotated dataset of over 8,000 sentences. The model identifies the following entity types:

  • PER: Person
  • LOC: Location
  • ORG: Organization

🤗 Model Details

  • Base model: xlm-roberta-base (270 M parameters)
  • Fine-tuning
    • Epochs: 5
    • Batch size: 16
    • Max seq length: 128 tokens
    • Optimizer: AdamW
    • Learning rate: 2e-5
    • Warmup steps: 500
    • Weight decay: 0.01
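
As a rough sketch, the hyperparameters above map onto Hugging Face `TrainingArguments` keywords as shown below. The original training script is not part of this card, so the use of `Trainer`, the `output_dir` value, the per-device reading of the batch size, and the helper name `build_training_args` are assumptions for illustration only.

```python
# Hyperparameters from the list above, keyed by the matching
# transformers.TrainingArguments keyword names.
TRAINING_KWARGS = {
    "num_train_epochs": 5,
    "per_device_train_batch_size": 16,  # assumes the batch size of 16 is per device
    "learning_rate": 2e-5,
    "warmup_steps": 500,
    "weight_decay": 0.01,
}

# The sequence limit is applied at tokenization time, not by the trainer.
TOKENIZER_KWARGS = {"truncation": True, "max_length": 128}


def build_training_args(output_dir="ku-ner-xlmr-out"):  # hypothetical path
    """Build TrainingArguments; Trainer uses an AdamW optimizer by default."""
    from transformers import TrainingArguments  # imported lazily

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

The tokenizer settings would be passed when encoding the data, for example `tokenizer(words, is_split_into_words=True, **TOKENIZER_KWARGS)`.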

๐Ÿ” Intended Use

  • Extract named entities from Kurmanji Kurdish text (news, social media, etc.)
  • Aid in information extraction, digital humanities, and low-resource language research

🧪 Evaluation Metrics

Test set: 1,630 sentences (≈26k tokens)

Entity    Precision  Recall  F1 Score
PER       0.8719     0.8666  0.8692
LOC       0.8817     0.8825  0.8821
ORG       0.7280     0.7930  0.7591
Overall   0.8325     0.8511  0.8414
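
For readers who want to relate the columns, F1 is the harmonic mean of precision and recall. The sketch below recomputes the per-entity F1 values from the precision and recall reported above; the function name `f1_score` is chosen here for illustration and is not taken from the evaluation code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the table above.
reported = {
    "PER": (0.8719, 0.8666),
    "LOC": (0.8817, 0.8825),
    "ORG": (0.7280, 0.7930),
}

recomputed = {label: round(f1_score(p, r), 4) for label, (p, r) in reported.items()}
for label, f1 in recomputed.items():
    print(f"{label}: F1 = {f1:.4f}")
# PER: F1 = 0.8692
# LOC: F1 = 0.8821
# ORG: F1 = 0.7591
```

Applying the same formula to the Overall precision and recall gives about 0.8417 rather than the reported 0.8414. The card does not say how the Overall row was averaged, so the gap may reflect a different averaging scheme.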

๐ŸŒ Try it Online

👉 Streamlit Demo
Paste a sentence in Kurmanji Kurdish (Latin script) and explore the model's predictions in your browser.


๐Ÿ› ๏ธ How to Use

You can also load and use the model via Hugging Face ๐Ÿค— Transformers:

from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load model and tokenizer
model_id = "akam-ot/ku-ner-xlmr"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

# Create NER pipeline
ner = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

# Example sentence
sentence = "Navê min Hejar e û ez li Hewlêr dijîm."

# Run NER
results = ner(sentence)

# Display results
for ent in results:
    print(f"{ent['word']} → {ent['entity_group']} (score: {ent['score']:.2f})")
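
With `aggregation_strategy="simple"`, the pipeline returns a list of dictionaries with keys such as `entity_group`, `score`, `word`, `start`, and `end`. A minimal sketch of grouping those results by entity type follows. The helper name `group_entities`, the `min_score` threshold, and the sample data are made up for illustration and are not actual model output.

```python
from collections import defaultdict


def group_entities(results, min_score=0.0):
    """Group pipeline results by entity type, keeping spans at or above min_score."""
    grouped = defaultdict(list)
    for ent in results:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)


# Hand-written sample in the pipeline's output shape (NOT real model output).
sample = [
    {"entity_group": "PER", "score": 0.99, "word": "Hejar", "start": 9, "end": 14},
    {"entity_group": "LOC", "score": 0.98, "word": "Hewlêr", "start": 25, "end": 31},
]

print(group_entities(sample, min_score=0.5))
```

In practice, `group_entities(ner(sentence))` could replace the print loop when only the entity strings per type are needed.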