xlm-roberta-base-lora-amharic-news-classification
This repo contains LoRA adapters for the xlm-roberta-base model, fine-tuned on the Amharic-News-Text-classification-Dataset.
The fine-tuned model classifies an Amharic news article into one of the following six categories:
- ሀገር አቀፍ ዜና (Local News)
- መዝናኛ (Entertainment)
- ስፖርት (Sports)
- ቢዝነስ (Business)
- ዓለም አቀፍ ዜና (International News)
- ፖለቲካ (Politics)
It achieves the following results on the evaluation set:
- Train Loss: 0.3563
- Validation Loss: 0.3613
- Validation Accuracy: 0.8642
- Validation F1 Score (macro): 0.8220
- Validation F1 Score (weighted): 0.8648
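The macro and weighted F1 scores differ because macro averaging gives every category equal weight, while weighted averaging weights each category's F1 by its number of true examples. The sketch below illustrates the difference in plain Python. The toy labels and the helper names (`per_class_f1`, `macro_f1`, `weighted_f1`) are made up for illustration; they are not taken from the dataset or from the author's evaluation code.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each label to its F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of the per-class F1 scores, weighted by true-label counts."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Toy, imbalanced example (hypothetical data): 8 sports articles, 2 business articles.
toy_true = ["sports"] * 8 + ["business"] * 2
toy_pred = ["sports"] * 8 + ["sports", "business"]

toy_macro = macro_f1(toy_true, toy_pred)
toy_weighted = weighted_f1(toy_true, toy_pred)
print(f"macro F1: {toy_macro:.4f}, weighted F1: {toy_weighted:.4f}")
```

A weighted score higher than the macro score, as in the results reported here, is what one would expect if smaller categories are classified less accurately, though this model card does not report per-category scores.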
How to use
You can use this model with a pipeline for text classification.
First, install the peft library:
pip install peft
Then, you can run the following code.
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

model_id = "xlm-roberta-base"
peft_model_id = "rasyosef/xlm-roberta-base-lora-amharic-news-classification"

# The six news categories and the mappings between labels and class ids
categories = ['ሀገር አቀፍ ዜና', 'መዝናኛ', 'ስፖርት', 'ቢዝነስ', 'ዓለም አቀፍ ዜና', 'ፖለቲካ']
id2label = {i: lbl for i, lbl in enumerate(categories)}
label2id = {lbl: i for i, lbl in enumerate(categories)}

# Load the base model with a 6-way classification head
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    num_labels=len(categories),  # 6
    id2label=id2label,
    label2id=label2id,
)

# Load the LoRA adapter weights on top of the base model (requires peft)
model.load_adapter(peft_model_id)

classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
classifier([
"""ቅርሶቹን ለመታደግ የተጀመረው የሙዚዬም ግንባታም በበጀት ምክንያት ተቋርጧል።
በአፄ ቴዎድሮስ የንግስና ቦታ ደረስጌ ማሪያም ተጀምሮ የቆመው የሙዚየሙ ግንባታ ተጠናቀቆ ስራ
እንዲጀምር ነዋሪዎች ጠይቀዋል። ዘመነ መሳፍንት መቋጫ ያገኘባት የኢትዮጵያ አንድነት የታወጀባት ዳግማዊ
አፄ ቴዎድሮስ ከመንገሳቸው በፊት ደጃች ውቤን ቧሂት ከሚባል ቦታ ድል አድርገው ደጃች ውቤ ለንግስና ባዘጋጁት የንግስና ቦታና
እቃዎች ንጉሰ ነገስት ዘኢትዮጵያ ተብለው የነገሱባት ቦታ ናት።""", # 'ሀገር አቀፍ ዜና'
])
Output:
[{'label': 'ሀገር አቀፍ ዜና', 'score': 0.977573037147522}]
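The pipeline returns the Amharic category name as the label. For readers who want the English names given in the category list at the top of this card, a small post-processing step can attach them. The following is a minimal sketch; `CATEGORY_TRANSLATIONS` and `add_english_labels` are hypothetical names introduced for illustration and are not part of the model or of the transformers library.

```python
# English names for each category, copied from the category list in this card.
CATEGORY_TRANSLATIONS = {
    "ሀገር አቀፍ ዜና": "Local News",
    "መዝናኛ": "Entertainment",
    "ስፖርት": "Sports",
    "ቢዝነስ": "Business",
    "ዓለም አቀፍ ዜና": "International News",
    "ፖለቲካ": "Politics",
}


def add_english_labels(predictions):
    """Return copies of the predictions with an extra 'label_en' key.

    Labels without a known translation are passed through unchanged.
    """
    return [
        {**pred, "label_en": CATEGORY_TRANSLATIONS.get(pred["label"], pred["label"])}
        for pred in predictions
    ]


# The pipeline output shown in this card, used here as sample input.
sample_output = [{"label": "ሀገር አቀፍ ዜና", "score": 0.977573037147522}]
translated = add_english_labels(sample_output)
print(translated[0]["label_en"])
```

In practice, `sample_output` would be replaced by the list returned from the `classifier(...)` call.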
Demo
Use the following demo to play around with the model: https://huggingface.co/spaces/rasyosef/amharic-news-classification
Framework versions
- PEFT 0.7.1