📧 Model Card for aamoshdahal/email-phishing-distilbert-finetuned

This model is a fine-tuned version of DistilBERT (distilbert-base-uncased) trained specifically for phishing email detection. It classifies email content into two categories: phishing and legitimate.

The model was trained on the Phishing Email Dataset aggregated on Kaggle (see Training Data below) and evaluated against the cybersectony/PhishingEmailDetectionv2.0 dataset.

It is optimized for:

  • High recall to catch most phishing attempts
  • High precision to reduce false positives
  • Fast inference via the lightweight DistilBERT architecture
  • Interpretability, with support for token-level explanations using transformers-interpret

This model is ideal for security tools, email scanning systems, awareness training platforms, and research on adversarial phishing attacks.

Model Details

Model Description

This is a fine-tuned DistilBERT model trained to classify email content as either phishing or legitimate. It was developed as part of a cybersecurity research project on detecting phishing attempts in email messages with a fine-tuned transformer model.

  • Developed by: @aamoshdahal
  • Model type: DistilBERT (Transformer-based sequence classifier)
  • Language(s): English
  • Finetuned from model: distilbert-base-uncased

Intended Uses & Users

This model is intended to be used as a lightweight and reliable phishing email detector. It can be integrated into:

  • Email clients or gateways to filter phishing emails in real time
  • Security software or firewalls as an additional phishing classifier
  • Educational tools for training users to recognize phishing attempts
  • Research environments to study adversarial or evolving phishing tactics

Foreseeable Users:

  • Cybersecurity professionals
  • Software developers integrating NLP into email platforms
  • Researchers working on phishing detection

Foreseeable Impact:

  • Improved early detection of phishing attacks
  • Reduced exposure to credential theft and fraud
  • Increased public understanding of phishing strategies

🚀 How to Get Started with the Model

You can use the code snippet below to quickly load the fine-tuned model and make predictions on any email content:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
from transformers_interpret import SequenceClassificationExplainer

# Load the model and tokenizer from Hugging Face Hub
model_id = "aamoshdahal/email-phishing-distilbert-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Set device (GPU if available)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

# Example email for prediction
email = """Dear user,

We detected suspicious activity on your account. Please verify your identity immediately by clicking the link below to avoid suspension.

[Phishing Link Here]

Thank you,
Security Team"""

# Tokenize and prepare the input
encoded_input = tokenizer(email, return_tensors='pt', truncation=True, padding=True).to(device)

# Make prediction
with torch.no_grad():
    outputs = model(**encoded_input)
    probs = torch.nn.functional.softmax(outputs.logits, dim=1)

# Output prediction
labels = ["legitimate", "phishing"]
pred_label = labels[probs.argmax()]
confidence = probs.max().item()

print(f"Prediction: {pred_label} ({confidence:.2%} confidence)")

# Token-level explanation with transformers-interpret
explainer = SequenceClassificationExplainer(model=model, tokenizer=tokenizer)

# Attribute tokens toward the phishing class (LABEL_1; LABEL_0 = legitimate)
word_attributions = explainer(email, class_name="LABEL_1")
explainer.visualize()
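
The snippet above scores a single email; the sketch below extends it to a small batch. It reuses the model, tokenizer, device, and labels objects defined above, and the example emails are placeholders rather than samples from the training data.

# Hypothetical batch of emails (placeholders for illustration only)
emails = [
    "Hi team, the quarterly report is attached. Let me know if anything is missing.",
    "URGENT: Your mailbox is over quota. Click the link within 24 hours to avoid suspension.",
]

# Tokenize the whole batch at once; padding aligns shorter emails to the longest one
batch = tokenizer(emails, return_tensors="pt", truncation=True, padding=True).to(device)

with torch.no_grad():
    batch_probs = torch.nn.functional.softmax(model(**batch).logits, dim=1)

for text, probs_row in zip(emails, batch_probs):
    idx = int(probs_row.argmax())
    print(f"{labels[idx]:<10} ({probs_row[idx].item():.2%}) {text[:60]}")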

πŸ‹οΈβ€β™‚οΈ Training Details

📦 Training Data

The model was fine-tuned on a balanced phishing email dataset compiled from multiple public sources, including:

  • Enron Email Dataset
  • CEAS 2008 Phishing Corpus
  • Ling-Spam Dataset
  • SpamAssassin
  • Nazario Phishing Emails
  • Nigerian Fraud Email Dataset

These sources were aggregated and preprocessed via the Phishing Email Dataset on Kaggle. Each entry provides a text_combined field, which concatenates the subject line, body text, sender address, and timestamp to give the classifier full context.
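
As an illustration only, the sketch below shows one way such a combined field can be assembled with pandas; the raw column names (subject, body, sender, date) are assumptions and may not match the Kaggle dataset's actual schema.

import pandas as pd

# Hypothetical raw record; real column names in the Kaggle dataset may differ
df = pd.DataFrame({
    "sender": ["security@account-alerts.example"],
    "date": ["2024-03-01 09:14:00"],
    "subject": ["Account suspended - action required"],
    "body": ["Please verify your identity immediately by clicking the link below."],
    "label": [1],  # 0 = legitimate, 1 = phishing
})

# Concatenate subject, body, sender, and timestamp into one field,
# mirroring the text_combined column described above
df["text_combined"] = (
    "subject: " + df["subject"].fillna("")
    + " body: " + df["body"].fillna("")
    + " sender: " + df["sender"].fillna("")
    + " date: " + df["date"].fillna("")
)

print(df["text_combined"].iloc[0])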


⚙️ Training Procedure

This model was fine-tuned using the Hugging Face 🤗 Trainer API with the following configuration:

  • Base model: distilbert-base-uncased
  • Architecture: Transformer-based sequence classifier (DistilBertForSequenceClassification)
  • Epochs: 3
  • Batch size: 16
  • Learning rate: 2e-5
  • Weight decay: 0.01
  • Evaluation strategy: Per epoch
  • Monitoring: All metrics logged via Weights & Biases (W&B)

The model was trained using a Tesla A100 GPU (40GB VRAM) on Google Colab Pro.
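
A minimal sketch of a comparable Trainer setup is shown below; the tiny in-memory dataset, output directory, and split handling are placeholders, while the hyperparameters mirror the list above.

from datasets import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer,
)

base = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)

# Tiny placeholder dataset; the actual run used the Kaggle phishing corpus described above
raw = Dataset.from_dict({
    "text": [
        "Meeting moved to 3 pm, agenda attached.",
        "Verify your account now or it will be suspended.",
    ],
    "label": [0, 1],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

encoded = raw.map(tokenize, batched=True)
train_ds = eval_ds = encoded  # placeholder; the real run used separate train/validation splits

# Hyperparameters from the list above
args = TrainingArguments(
    output_dir="distilbert-phishing",   # placeholder output path
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=2e-5,
    weight_decay=0.01,
    eval_strategy="epoch",              # evaluate once per epoch (evaluation_strategy in older transformers releases)
    logging_strategy="epoch",
    report_to="wandb",                  # assumes wandb is installed and configured; use "none" to disable
)

trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds)
trainer.train()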

Preprocessing

  • Duplicate and null record removal
  • Lowercasing and whitespace cleanup
  • Tokenization using DistilBertTokenizer
  • Label encoding (0 = legitimate, 1 = phishing)
  • Random Undersampling to ensure class balance
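
A hedged sketch of these steps in pandas is shown below; it assumes a df with text_combined and label columns (for example, the frame built in the earlier sketch) and is not the exact pipeline used for this model.

import re
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    # Remove duplicates and rows with missing text or label
    df = df.drop_duplicates(subset="text_combined")
    df = df.dropna(subset=["text_combined", "label"])

    # Lowercase and collapse repeated whitespace
    df["text_combined"] = (
        df["text_combined"]
        .str.lower()
        .map(lambda t: re.sub(r"\s+", " ", t).strip())
    )

    # Labels are assumed already encoded as 0 = legitimate, 1 = phishing

    # Random undersampling: sample every class down to the minority-class count
    n_min = df["label"].value_counts().min()
    balanced = (
        df.groupby("label")
          .sample(n=n_min, random_state=42)
          .reset_index(drop=True)
    )
    return balanced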

📊 Evaluation Results

For updated results and training runs, see the associated public Weights & Biases (wandb) project and its full report.

The fine-tuned DistilBERT model was evaluated on a test dataset containing both phishing and legitimate emails. Below is a summary of its performance compared to baseline models (raw DistilBERT and raw BERT):

📈 Fine-Tuned DistilBERT (Best Performing)

Epoch  Training Loss  Validation Loss  Accuracy  Precision  Recall  F1 Score  ROC AUC
1      0.0323         0.0243           0.9936    0.9916     0.9961  0.9939    0.9996
2      0.0083         0.0297           0.9938    0.9968     0.9912  0.9940    0.9998
3      0.0044         0.0275           0.9951    0.9959     0.9947  0.9953    0.9997

  • Test Set Summary:
    • Accuracy: 96.62%
    • Precision: 95.90%
    • Recall: 97.46%
    • F1 Score: 96.67%
    • ROC AUC: 0.9953
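
For reference, the sketch below shows how metrics like these are typically computed with scikit-learn; y_true and y_score are placeholder values, not the actual evaluation outputs.

from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score, roc_auc_score,
)

# Placeholder outputs; in practice y_score is P(phishing) from the model's softmax on the test set
y_true  = [1, 0, 1, 1, 0, 0, 1, 0]                           # 0 = legitimate, 1 = phishing
y_score = [0.97, 0.12, 0.88, 0.65, 0.40, 0.05, 0.93, 0.22]   # predicted probability of phishing
y_pred  = [int(p >= 0.5) for p in y_score]                   # threshold at 0.5

print(f"Accuracy : {accuracy_score(y_true, y_pred):.4f}")
print(f"Precision: {precision_score(y_true, y_pred):.4f}")
print(f"Recall   : {recall_score(y_true, y_pred):.4f}")
print(f"F1 Score : {f1_score(y_true, y_pred):.4f}")
print(f"ROC AUC  : {roc_auc_score(y_true, y_score):.4f}")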

⚠️ Raw DistilBERT (Untrained)

  • Accuracy: 49.57%
  • Precision: 0.00%
  • Recall: 0.00%
  • F1 Score: 0.00
  • ROC AUC: 0.5694

⚠️ Raw BERT (Untrained)

  • Accuracy: 49.57%
  • Precision: 0.00%
  • Recall: 0.00%
  • F1 Score: 0.00
  • ROC AUC: 0.4984
