---
language: en
datasets:
- jigsaw-toxic-comment-classification-challenge
tags:
- text-classification
- multi-label-classification
- toxicity-detection
- bert
- transformers
- pytorch
license: apache-2.0
model-index:
- name: BERT Multi-label Toxic Comment Classifier
  results:
  - task:
      name: Multi-label Text Classification
      type: multi-label-classification
    dataset:
      name: Jigsaw Toxic Comment Classification Challenge
      type: jigsaw-toxic-comment-classification-challenge
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9187
---
# BERT Multi-label Toxic Comment Classifier

This model is a fine-tuned `bert-base-uncased` transformer for multi-label classification on the Jigsaw Toxic Comment Classification Challenge dataset. It predicts multiple toxicity-related labels per comment, including:
- toxicity
- severe toxicity
- obscene
- threat
- insult
- identity attack
- sexual explicit
## Model Details

- Base Model: `bert-base-uncased`
- Task: Multi-label text classification
- Dataset: Jigsaw Toxic Comment Classification Challenge (processed version)
- Labels: 7 toxicity-related categories
- Training Epochs: 2
- Batch Size: 16 (train), 64 (eval)
- Metrics: Accuracy, Macro F1, Precision, Recall
## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Koushim/bert-multilabel-jigsaw-toxic-classifier")
model = AutoModelForSequenceClassification.from_pretrained("Koushim/bert-multilabel-jigsaw-toxic-classifier")

text = "You are a wonderful person!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)
outputs = model(**inputs)

# Apply sigmoid to get independent per-label probabilities
probs = torch.sigmoid(outputs.logits)
print(probs)
```
## Labels

| Index | Label |
|---|---|
| 0 | toxicity |
| 1 | severe_toxicity |
| 2 | obscene |
| 3 | threat |
| 4 | insult |
| 5 | identity_attack |
| 6 | sexual_explicit |
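
To turn the sigmoid probabilities from the Usage snippet into label names, you can threshold them against the index-to-label mapping above. The snippet below is a minimal sketch: the 0.5 threshold and the hard-coded label list are illustrative assumptions, not values read from the model configuration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Label order follows the table above (assumed; adjust if your config differs)
LABELS = ["toxicity", "severe_toxicity", "obscene", "threat",
          "insult", "identity_attack", "sexual_explicit"]

tokenizer = AutoTokenizer.from_pretrained("Koushim/bert-multilabel-jigsaw-toxic-classifier")
model = AutoModelForSequenceClassification.from_pretrained("Koushim/bert-multilabel-jigsaw-toxic-classifier")

inputs = tokenizer("You are a wonderful person!", return_tensors="pt",
                   truncation=True, padding=True, max_length=128)
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits)[0]

# 0.5 is an illustrative threshold; tune it per label on a validation set
predicted = [label for label, p in zip(LABELS, probs.tolist()) if p > 0.5]
print(predicted)
```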
## Training Details
- Training Set: Full dataset (160k+ samples)
- Loss Function: Binary Cross Entropy (via `BertForSequenceClassification` with `problem_type="multi_label_classification"`)
- Optimizer: AdamW
- Learning Rate: 2e-5
- Evaluation Strategy: Epoch-based evaluation with early stopping on F1 score (see the metrics sketch after this list)
- Model Framework: PyTorch with Hugging Face Transformers
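
The metrics listed above (accuracy, macro F1, precision, recall) can be computed for multi-label outputs by thresholding the sigmoid probabilities and scoring with scikit-learn. The `compute_metrics` function below is a hedged sketch of how this might be wired into the `Trainer`, not the exact code used to train this model; the 0.5 threshold is an assumption.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

def compute_metrics(eval_pred):
    """Threshold sigmoid outputs at 0.5 and report multi-label metrics."""
    logits, labels = eval_pred
    probs = 1 / (1 + np.exp(-logits))        # sigmoid over raw logits
    preds = (probs > 0.5).astype(int)        # illustrative threshold
    labels = labels.astype(int)              # multi-hot targets as 0/1 integers
    return {
        "accuracy": accuracy_score(labels, preds),  # exact-match (subset) accuracy
        "f1": f1_score(labels, preds, average="macro", zero_division=0),
        "precision": precision_score(labels, preds, average="macro", zero_division=0),
        "recall": recall_score(labels, preds, average="macro", zero_division=0),
    }
```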
## Repository Contents

- `pytorch_model.bin` - trained model weights
- `config.json` - model configuration
- `tokenizer.json`, `vocab.txt` - tokenizer files
- `README.md` - this file
## How to Fine-tune or Train

You can fine-tune this model using the Hugging Face `Trainer` API with your own dataset or the original Jigsaw dataset, as sketched below.
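
A minimal `Trainer` setup might look like the following. The hyperparameters mirror those listed under Training Details, but the dataset preparation, output path, and early-stopping patience are assumptions you will need to adapt to your own data.

```python
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments, EarlyStoppingCallback)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=7,
    problem_type="multi_label_classification",  # enables BCEWithLogitsLoss
)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

args = TrainingArguments(
    output_dir="bert-multilabel-toxic",          # assumed output path
    learning_rate=2e-5,
    num_train_epochs=2,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    eval_strategy="epoch",                       # "evaluation_strategy" on older transformers
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="f1",
)

# train_dataset / eval_dataset are placeholders: tokenized datasets whose
# "labels" column holds float multi-hot vectors over the 7 categories
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,             # e.g. the sketch under Training Details
    callbacks=[EarlyStoppingCallback(early_stopping_patience=1)],  # assumed patience
)
trainer.train()
```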
## Citation

If you use this model in your research or project, please cite:

```bibtex
@article{devlin2019bert,
  title={BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding},
  author={Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
  journal={arXiv preprint arXiv:1810.04805},
  year={2019}
}
```
## License
Apache 2.0 License