Model Card for Toxicity Detection Model

This model is a fine-tuned version of dbmdz/bert-base-turkish-cased for toxicity detection in Turkish text. It was trained on labeled datasets of online comments categorized by toxicity level. The model uses the Hugging Face transformers library and is suitable for sequence classification tasks. This work was completed as a project assignment for the Natural Language Processing (CENG493) course at Çankaya University.

  • Model Type: Sequence Classification
  • Language(s): Turkish
  • License: GNU General Public License (GPL)
  • Fine-tuned from: dbmdz/bert-base-turkish-cased
  • Model size: 111M parameters (F32, Safetensors)

Uses

This model can be used directly to analyze the toxicity of text in Turkish. For example:

  • Content moderation in online forums and social media platforms
  • Filtering harmful language in customer reviews or feedback
  • Monitoring and preventing cyberbullying in messaging applications
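
A minimal inference sketch using the transformers pipeline API. The exact label names the model returns (e.g. LABEL_0/LABEL_1) depend on its id2label config and are an assumption here:

```python
from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub
classifier = pipeline(
    "text-classification",
    model="fc63/toxic-classification-model",
)

# Classify a Turkish comment; the label names come from the
# model's id2label mapping and may differ from this example
result = classifier("Bu harika bir ürün, çok beğendim!")
print(result)  # e.g. [{'label': 'LABEL_0', 'score': 0.98}]
```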

Downstream Use

  • Integrating toxic language filtering into chatbots or virtual assistants
  • Using it as part of a sentiment analysis pipeline
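
As a sketch of the chatbot-filtering use case, a thin wrapper could gate incoming messages on the predicted label. The LABEL_1 name and 0.5 threshold below are assumptions and should be checked against the model's id2label mapping:

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="fc63/toxic-classification-model")

def is_toxic(message: str, threshold: float = 0.5) -> bool:
    # Assumption: the toxic class is exposed as "LABEL_1";
    # verify against classifier.model.config.id2label first.
    prediction = classifier(message)[0]
    return prediction["label"] == "LABEL_1" and prediction["score"] >= threshold

# Block a message before it reaches the chatbot's reply logic
user_message = "örnek bir kullanıcı mesajı"
if is_toxic(user_message):
    print("Message blocked by the toxicity filter.")
```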

Out-of-Scope Use

  • Not suitable for analyzing languages other than Turkish
  • Should not be used for sensitive decision-making without human oversight

Bias, Risks, and Limitations

The model may inherit biases from the training data, including overrepresentation or underrepresentation of certain demographics or topics. It may also misclassify non-toxic content as toxic or fail to detect subtler forms of toxicity.

Recommendations

Users should:

  • Avoid deploying the model in high-stakes scenarios without additional validation.
  • Regularly monitor performance and update the model if new biases are detected.

Training Data

The model was fine-tuned on the Overfit-GM/turkish-toxic-language dataset:

https://huggingface.co/datasets/Overfit-GM/turkish-toxic-language
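
The dataset can be pulled directly with the datasets library, for example:

```python
from datasets import load_dataset

# Download the dataset from the Hugging Face Hub and inspect it
dataset = load_dataset("Overfit-GM/turkish-toxic-language")
print(dataset)  # shows the available splits and column names
```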

Evaluation

The model was evaluated on a held-out test set containing a balanced mix of toxic and non-toxic examples.
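
The card does not publish metrics. Below is a minimal sketch of how such an evaluation could be reproduced with scikit-learn, assuming the dataset's "train" split with integer "label" and string "text" columns; the actual held-out test set used is not documented here:

```python
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score
from transformers import pipeline

classifier = pipeline("text-classification", model="fc63/toxic-classification-model")

# Assumption: a "train" split with "text" and "label" columns;
# the card does not specify which examples formed the test set.
sample = load_dataset("Overfit-GM/turkish-toxic-language", split="train").select(range(200))

texts = list(sample["text"])
labels = list(sample["label"])

# Map predicted label strings back to class ids via the model config
label2id = classifier.model.config.label2id
preds = [label2id[p["label"]] for p in classifier(texts)]

print("accuracy:", accuracy_score(labels, preds))
print("macro F1:", f1_score(labels, preds, average="macro"))
```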
