# MeowML/ToxicBERT - Turkish Toxic Language Detection

## Model Description

ToxicBERT is a fine-tuned BERT model designed to detect toxic language in Turkish text. Built on the dbmdz/bert-base-turkish-cased foundation model, the classifier identifies potentially harmful, offensive, or toxic content in Turkish social media posts, comments, and general text.
## Model Details

- Model Type: Text Classification (Binary)
- Language: Turkish (tr)
- Base Model: dbmdz/bert-base-turkish-cased
- License: MIT
- Library: Transformers
- Task: Toxicity Detection
## Intended Use

### Primary Use Cases
- Content moderation for Turkish social media platforms
- Automated filtering of user-generated content
- Research in Turkish NLP and toxicity detection
- Educational purposes for understanding toxic language patterns
### Out-of-Scope Use

- Should not be the sole decision-maker for content moderation; human oversight is required
- Not suitable for languages other than Turkish
- Should not be used in sensitive applications without proper validation and testing
## Training Data

The model was trained on the Overfit-GM/turkish-toxic-language dataset, which contains Turkish text samples labeled for toxicity. The dataset includes various forms of toxic content commonly found in online Turkish communications.
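The dataset can be pulled directly with the `datasets` library for inspection or re-training. A minimal sketch, assuming a standard `train` split (the exact split and column names are not documented here):

```python
from datasets import load_dataset

# Load the toxicity dataset from the Hugging Face Hub
ds = load_dataset("Overfit-GM/turkish-toxic-language")

print(ds)              # lists the available splits
print(ds["train"][0])  # inspect one example; column names depend on the dataset
```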
## Model Performance

For each input, the model outputs:

- Binary Classification: 0 (non-toxic) or 1 (toxic)
- Confidence Score: the probability assigned to the predicted class
- Toxic Probability: the probability that the text is toxic
## Usage

### Quick Start

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer (the tokenizer comes from the base checkpoint)
tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-turkish-cased")
model = AutoModelForSequenceClassification.from_pretrained("MeowML/ToxicBERT")

# Prepare text
text = "Merhaba, nasılsın?"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=256)

# Get prediction
with torch.no_grad():
    outputs = model(**inputs)

probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
prediction = torch.argmax(probabilities, dim=-1)

toxic_probability = probabilities[0][1].item()
is_toxic = bool(prediction.item())

print(f"Is toxic: {is_toxic}")
print(f"Toxic probability: {toxic_probability:.4f}")
```
### Advanced Usage with Custom Class

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification


class ToxicLanguageDetector:
    def __init__(self, model_name="MeowML/ToxicBERT"):
        # Tokenizer comes from the base checkpoint, model weights from the fine-tuned repo
        self.tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-turkish-cased")
        self.model = AutoModelForSequenceClassification.from_pretrained(model_name)
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.model.to(self.device)
        self.model.eval()

    def predict(self, text):
        inputs = self.tokenizer(
            text,
            truncation=True,
            padding="max_length",
            max_length=256,
            return_tensors="pt",
        ).to(self.device)

        with torch.no_grad():
            outputs = self.model(**inputs)

        probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
        prediction = torch.argmax(probabilities, dim=-1)

        return {
            "text": text,
            "is_toxic": bool(prediction.item()),
            "toxic_probability": probabilities[0][1].item(),
            "confidence": probabilities[0].max().item(),
        }


# Usage
detector = ToxicLanguageDetector()
result = detector.predict("Merhaba, nasılsın?")
print(result)
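```

For moderating many comments at once, per-text calls waste throughput. Below is a hedged sketch of a batched helper; `predict_batch` and its `batch_size` parameter are illustrative additions, not part of the published class:

```python
import torch


def predict_batch(self, texts, batch_size=32):
    """Classify a list of texts in batches; returns one result dict per text."""
    results = []
    for start in range(0, len(texts), batch_size):
        chunk = texts[start:start + batch_size]
        inputs = self.tokenizer(
            chunk, truncation=True, padding=True, max_length=256, return_tensors="pt"
        ).to(self.device)
        with torch.no_grad():
            probs = torch.nn.functional.softmax(self.model(**inputs).logits, dim=-1)
        for text, p in zip(chunk, probs):
            results.append({
                "text": text,
                "is_toxic": bool(p.argmax().item()),
                "toxic_probability": p[1].item(),
                "confidence": p.max().item(),
            })
    return results


# Attach to the class defined above (monkey-patching for brevity)
ToxicLanguageDetector.predict_batch = predict_batch
```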
## Limitations and Biases

### Limitations
- The model's performance depends heavily on the training data quality and coverage
- May have difficulty with context-dependent toxicity (sarcasm, irony)
- Performance may vary across different Turkish dialects or informal language
- Shorter texts might be more challenging to classify accurately
### Potential Biases
- The model may reflect biases present in the training dataset
- Certain topics, demographics, or linguistic patterns might be over- or under-represented
- Regular evaluation and bias testing are recommended for production use
## Ethical Considerations
- This model should be used responsibly with human oversight
- False positives and negatives are expected and should be accounted for
- Consider the impact on freedom of expression when implementing automated moderation
- Regular auditing and updating are recommended to maintain fairness
## Technical Specifications

- Input: Text strings (max 256 tokens; longer inputs are truncated)
- Output: Binary classification with probability scores
- Model Size: BERT-base architecture (~110M parameters)
- Inference: Runs on both CPU and GPU
- Memory Requirements: roughly 0.4 GB of weights in FP32; runs on standard hardware
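
For quick experiments, the Transformers pipeline API wraps tokenization, inference, and softmax in a single call. A minimal sketch, passing the base-model tokenizer explicitly to mirror the examples above (label names in the output depend on the model's config):

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="MeowML/ToxicBERT",
    tokenizer="dbmdz/bert-base-turkish-cased",
)

print(classifier("Merhaba, nasılsın?"))
# e.g. [{'label': 'LABEL_1', 'score': 0.97}]; label names depend on the model config
```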
## Citation

If you use this model in your research or applications, please cite:

```bibtex
@misc{meowml_toxicbert_2024,
  title={ToxicBERT: Turkish Toxic Language Detection},
  author={MeowML},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/MeowML/ToxicBERT}
}
```
## Acknowledgments

- Base model: dbmdz/bert-base-turkish-cased
- Training dataset: Overfit-GM/turkish-toxic-language
- Built with the Hugging Face Transformers library

## Contact
For questions, issues, or suggestions, please open an issue in the model repository or contact the MeowML team.
**Disclaimer:** This model is provided for research and educational purposes. Users are responsible for ensuring appropriate and ethical use in their applications.