---
library_name: transformers
tags:
- legal
datasets:
- ealvaradob/phishing-dataset
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
base_model:
- distilbert/distilbert-base-uncased
---
# 📧 distilbert-finetuned-phishing

A fine-tuned `distilbert-base-uncased` model for phishing email classification. The model is designed to distinguish between safe and phishing emails based on their natural language content.
Colab Notebook
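For quick experimentation, the model can be loaded through the `transformers` pipeline API. This is a minimal sketch: the repo id below is a placeholder, and the label names returned depend on the `id2label` mapping saved with the checkpoint.

```python
from transformers import pipeline

# Placeholder repo id -- replace with the actual model id or a local checkpoint path.
classifier = pipeline(
    "text-classification",
    model="your-username/distilbert-finetuned-phishing",
)

email = "Your account has been suspended. Click here to verify your password immediately."
print(classifier(email))
# Example output shape: [{'label': 'LABEL_1', 'score': 0.98}]
# Label names depend on the id2label mapping saved with the model.
```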
## 🧪 Evaluation Results
The model was trained on 77,677 emails and evaluated with the following results:
| Metric    | Value  |
|-----------|--------|
| Accuracy  | 0.9639 |
| Precision | 0.9648 |
| Recall    | 0.9489 |
| F1 Score  | 0.9568 |
| Eval Loss | 0.1326 |
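These are standard binary-classification metrics. The exact evaluation code is not shown in this card, but a `compute_metrics` function along the following lines, using scikit-learn, would produce them:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    """Compute accuracy, precision, recall, and F1 for binary phishing classification."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="binary"
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```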
## ⚙️ Training Configuration
```python
import torch
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./hf-phishing-model",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    num_train_epochs=3,
    weight_decay=0.01,
    logging_dir="./logs",
    load_best_model_at_end=True,
    fp16=torch.cuda.is_available(),  # use mixed precision when a GPU is available
)
```
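Below is a sketch of how these arguments might plug into a `Trainer` run on the `ealvaradob/phishing-dataset` dataset. The column names, subset selection, and train/test split are assumptions and are not confirmed by this card; check the dataset card before reusing this.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
)

# Assumption: the dataset exposes "text" and "label" columns; a specific subset
# may need to be selected -- see the dataset card.
dataset = load_dataset("ealvaradob/phishing-dataset")

tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert/distilbert-base-uncased", num_labels=2
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True)
# Assumption: only a "train" split is provided, so hold out 20% for evaluation.
splits = tokenized["train"].train_test_split(test_size=0.2, seed=42)

trainer = Trainer(
    model=model,
    args=training_args,               # the TrainingArguments defined above
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
    tokenizer=tokenizer,              # default collator then pads dynamically
    compute_metrics=compute_metrics,  # sketched in the evaluation section above
)
trainer.train()
```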