xlm-roberta-email-classifier

A fine-tuned version of xlm-roberta-base for multi-class classification of English-language emails.
The model is intended to automatically route or tag incoming messages based on their content.

Model Overview

  • Base Model: xlm-roberta-base
  • Task: Email classification (10 categories)
  • Language: English
  • Model size: 278M parameters (F32, Safetensors)
  • Frameworks: Hugging Face Transformers, PyTorch Lightning
  • Training Tracker: Weights & Biases
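
The training script is not included in this card. The snippet below is a minimal, hypothetical sketch of how a fine-tuning setup with PyTorch Lightning and Weights & Biases might look; the dataset, hyperparameters, and class names (EmailDataset, EmailClassifier) are placeholders, not the values actually used.

import torch
import pytorch_lightning as pl
from pytorch_lightning.loggers import WandbLogger
from torch.utils.data import DataLoader, Dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

class EmailDataset(Dataset):
    """Placeholder dataset: replace with the real labelled email corpus."""
    def __init__(self, texts, labels, tokenizer, max_length=256):
        self.enc = tokenizer(texts, truncation=True, padding="max_length",
                             max_length=max_length, return_tensors="pt")
        self.labels = torch.tensor(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: v[idx] for k, v in self.enc.items()}
        item["labels"] = self.labels[idx]
        return item

class EmailClassifier(pl.LightningModule):
    def __init__(self, lr=2e-5):
        super().__init__()
        self.save_hyperparameters()
        self.model = AutoModelForSequenceClassification.from_pretrained(
            "xlm-roberta-base", num_labels=10)

    def training_step(self, batch, batch_idx):
        # The HF model returns the cross-entropy loss when labels are passed
        out = self.model(**batch)
        self.log("train_loss", out.loss)
        return out.loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.hparams.lr)

if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
    train_ds = EmailDataset(["Example email body"], [0], tokenizer)  # toy data
    trainer = pl.Trainer(max_epochs=3,
                         logger=WandbLogger(project="email-classifier", offline=True))
    trainer.fit(EmailClassifier(), DataLoader(train_ds, batch_size=8))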

Performance

  • Accuracy: 0.42
  • F1 Score: 0.436
  • Precision: 0.527
  • Recall: 0.42
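
The card does not state which data split or averaging scheme produced these numbers. The sketch below shows one hypothetical way such metrics could be computed with scikit-learn, assuming a held-out test set and weighted averaging.

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Toy placeholders: replace with gold labels and model predictions for the real test split
y_true = [6, 0, 9, 2]
y_pred = [6, 1, 9, 2]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")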

Class Labels

The model predicts one of the following categories:

Label ID  Category
0         Billing and Payments
1         Customer Service
2         General Inquiry
3         Human Resources
4         IT Support
5         Product Support
6         Returns and Exchanges
7         Sales and Pre-Sales
8         Service Outages and Maintenance
9         Technical Support

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("ale-dp/xlm-roberta-email-classifier")
model = AutoModelForSequenceClassification.from_pretrained("ale-dp/xlm-roberta-email-classifier")
model.eval()

email_text = "I'd like to return the item I purchased last week."
inputs = tokenizer(email_text, return_tensors="pt", truncation=True)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

predicted_class_id = outputs.logits.argmax(dim=-1).item()

# Map the predicted class id back to its category name
id2label = {
    0: 'Billing and Payments',
    1: 'Customer Service',
    2: 'General Inquiry',
    3: 'Human Resources',
    4: 'IT Support',
    5: 'Product Support',
    6: 'Returns and Exchanges',
    7: 'Sales and Pre-Sales',
    8: 'Service Outages and Maintenance',
    9: 'Technical Support'
}
predicted_label = id2label[predicted_class_id]
print(predicted_label)
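
As an optional extension of the snippet above (not part of the original example), the logits can be turned into per-class probabilities with a softmax to inspect how confident the model is.

import torch

# Continues from the Usage snippet: convert logits to probabilities and show the top 3 categories
probs = torch.softmax(outputs.logits, dim=-1)[0]
for class_id in probs.argsort(descending=True)[:3]:
    print(f"{id2label[class_id.item()]}: {probs[class_id].item():.3f}")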