atasoglu/turkish-base-bert-uncased-offenseval2020_tr
This is an offensive language detection model for Turkish, obtained by fine-tuning ytu-ce-cosmos/turkish-base-bert-uncased on the coltekin/offenseval2020_tr dataset.
Usage
Quick usage:
from transformers import pipeline
pipe = pipeline("text-classification", "atasoglu/turkish-base-bert-uncased-offenseval2020_tr")
print(pipe("bu bir test metnidir.", top_k=None))
# [{'label': 'NOT', 'score': 0.9970345497131348}, {'label': 'OFF', 'score': 0.0029654440004378557}]
Or, loading the tokenizer and model directly:
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_id = "atasoglu/turkish-base-bert-uncased-offenseval2020_tr"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).to(device)
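@torch.no_grad()
def predict(X):
    # Tokenize a list of texts, padding/truncating each one to 256 tokens
    inputs = tokenizer(X, padding="max_length", truncation=True, max_length=256, return_tensors="pt")
    # Call the model directly rather than through .forward() so module hooks run
    outputs = model(**inputs.to(device))
    # Index of the highest logit for each input text
    return torch.argmax(outputs.logits, dim=-1).tolist()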
print(predict(["bu bir test metnidir."]))
# [0]
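The `predict` function above returns only class indices, while the pipeline returns labels with scores. The following is a minimal sketch of how raw logits could be turned into the same kind of labeled scores. It uses only the standard library so it runs without downloading the model. The `ID2LABEL` mapping is an assumption inferred from the two example outputs above (the same text yields `NOT` from the pipeline and `0` from `predict`); in practice the mapping should be read from `model.config.id2label`. The helper names and the example logits are hypothetical.

```python
import math

# Assumed mapping, inferred from the example outputs above (index 0 -> "NOT").
# In real use, prefer reading it from model.config.id2label.
ID2LABEL = {0: "NOT", 1: "OFF"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_labeled_scores(logits, id2label=ID2LABEL):
    """Convert one row of logits into label/score dicts, highest score first."""
    probs = softmax(logits)
    scored = [{"label": id2label[i], "score": p} for i, p in enumerate(probs)]
    return sorted(scored, key=lambda d: d["score"], reverse=True)


# Made-up logits for illustration only; they are not output of the real model.
example_logits = [2.9, -2.9]
result = to_labeled_scores(example_logits)
print(result)
```

With the real model, each row of `outputs.logits.tolist()` could be passed to `to_labeled_scores` in place of `example_logits`.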
Test Results
The metrics below were computed on the test split of the fine-tuning dataset.
| | precision | recall | f1-score | support |
|---|---|---|---|---|
| NOT | 0.9162 | 0.9559 | 0.9356 | 2812 |
| OFF | 0.7912 | 0.6564 | 0.7176 | 716 |
| accuracy | | | 0.8951 | 3528 |
| macro avg | 0.8537 | 0.8062 | 0.8266 | 3528 |
| weighted avg | 0.8908 | 0.8951 | 0.8914 | 3528 |
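The summary rows of the table follow from the per-class rows. As a rough sanity check, the sketch below recomputes F1 from precision and recall, and the macro and weighted averages from the per-class values and supports. The numbers are copied from the table; the helper names are hypothetical. Because the inputs are already rounded to four decimals, the recomputed values can differ from the reported ones in the last digit.

```python
# Per-class rows copied from the table above: (precision, recall, f1, support).
rows = {
    "NOT": (0.9162, 0.9559, 0.9356, 2812),
    "OFF": (0.7912, 0.6564, 0.7176, 716),
}


def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


def macro_avg(col):
    """Unweighted mean of one metric column across classes."""
    return sum(v[col] for v in rows.values()) / len(rows)


def weighted_avg(col):
    """Support-weighted mean of one metric column across classes."""
    total = sum(v[3] for v in rows.values())
    return sum(v[col] * v[3] for v in rows.values()) / total


for name, (p, r, reported_f1, _) in rows.items():
    print(name, "recomputed f1:", round(f1(p, r), 4), "reported:", reported_f1)

# Columns 0, 1, 2 are precision, recall, f1-score.
print("macro avg:", [round(macro_avg(c), 4) for c in range(3)])
print("weighted avg:", [round(weighted_avg(c), 4) for c in range(3)])
```

The support-weighted recall equals the overall accuracy, which is why both appear as 0.8951 in the table. The gap between macro and weighted F1 reflects the class imbalance: the OFF class has about a quarter of the support of NOT and a noticeably lower recall.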