XLM-RoBERTa Ticket Classifier

A multilingual email/ticket classifier fine-tuned from xlm-roberta-base to categorize customer support tickets in English and German. It predicts both routing category and issue type, helping automate ticket triage, intent detection, and prioritization in multilingual helpdesk environments.

Model Details

  • Base model: xlm-roberta-base
  • Languages: English 🇬🇧 & German 🇩🇪
  • Task: Multi-class text classification
  • Training data: german-english-email-ticket-classification
  • Tokenizer: SentencePiece (unigram language model), inherited from xlm-roberta-base
  • Framework: 🤗 Transformers

Classification Schema

This model performs multi-head classification, predicting both:

🎯 Queue (Routing Category)

  • Billing and Payments
  • Customer Service
  • General Inquiry
  • Human Resources
  • IT Support
  • Product Support
  • Returns and Exchanges
  • Sales and Pre-Sales
  • Service Outages and Maintenance
  • Technical Support

🛠️ Type (Issue Nature)

  • Incident
  • Request
  • Problem
  • Change

📈 Model Performance Summary

Metric              Value
Accuracy (Type)     85.73%
Accuracy (Queue)    51.89%
F1 Score (Type)     85.73%
F1 Score (Queue)    52.09%

The model performs well on type classification (85.73% accuracy over four classes). Queue prediction is much weaker (51.89% accuracy over ten classes), which reflects the difficulty of routing tickets across overlapping categories.
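
To make the two metrics concrete, here is a small pure-Python sketch of accuracy and a support-weighted F1 score. The card does not state which F1 averaging was used, so the weighted average here is an assumption, and the labels in the example are made up for illustration rather than taken from the evaluation set.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support.

    Assumption: the card's "F1 Score" may use a different averaging scheme.
    """
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        f1 = 2 * tp / (2 * tp + fp + fn)  # denominator > 0 because n >= 1
        total += f1 * n
    return total / len(y_true)


# Made-up example labels (not real evaluation data).
y_true = ["Incident", "Incident", "Request", "Problem"]
y_pred = ["Incident", "Request", "Request", "Problem"]
print(accuracy(y_true, y_pred), weighted_f1(y_true, y_pred))
```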

🔍 More detailed metrics, visualizations, and training curves are available on the W&B dashboard.

Intended Uses

  • Classify incoming tickets into predefined categories
  • Automate support ticket routing
  • Detect customer intent in multilingual environments
  • Integrate with helpdesk platforms like Zendesk or Freshdesk
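
Because queue accuracy is only around 52%, automated routing will usually need a confidence gate. The sketch below routes a ticket to the predicted queue only when the score clears a threshold and otherwise sends it to a human. The threshold value, the `"Manual Triage"` fallback name, and the `route_ticket` helper are all hypothetical; the input is assumed to be a dict with `label` and `score` keys, as returned by a Transformers text-classification pipeline.

```python
def route_ticket(prediction, threshold=0.7, fallback_queue="Manual Triage"):
    """Return the queue a ticket should go to.

    prediction: dict with "label" and "score" keys (pipeline-style output).
    threshold and fallback_queue are illustrative values, not tuned settings.
    """
    if prediction["score"] >= threshold:
        return prediction["label"]
    # Low-confidence predictions are left for a human agent to route.
    return fallback_queue


# Made-up predictions for illustration.
print(route_ticket({"label": "IT Support", "score": 0.91}))
print(route_ticket({"label": "Customer Service", "score": 0.38}))
```

In practice the threshold would be chosen on held-out tickets, trading the share of tickets routed automatically against the error rate among them.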

🚀 Usage

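from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

model_id = "ale-dp/xlm-roberta-ticket-classifier"

# Load the fine-tuned tokenizer and classification model from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Wrap both in a text-classification pipeline.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# German sample ticket. English: "Hello, the data analytics platform is not working
# properly and incorrect investment-analysis error messages are being generated.
# This could indicate a software bug."
text = "Hallo, Die Data-Analytics-Plattform funktioniert nicht richtig und es werden unkorrekte Investment-Analyse-Fehlermeldungen generiert. Dies könnte auf einen Software-Fehler hindeuten."

# The result is a list of dicts with "label" and "score" keys.
result = classifier(text)
print(result)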

Created by:

ᴀʟɪ ᴋʜᴀʟᴀᴊɪ

Citation

If you use this model, please cite:

@misc{xlm-roberta-ticket-classifier,
  author = {Ali Khalaji},
  title = {XLM-RoBERTa Ticket Classifier},
  year = {2025},
  url = {https://huggingface.co/ale-dp/xlm-roberta-ticket-classifier}
}

Model size: 278M params (Safetensors, tensor type F32)

Evaluation results

  • Accuracy (Type) on german-english-email-ticket-classification: 0.857 (self-reported)
  • Accuracy (Queue) on german-english-email-ticket-classification: 0.519 (self-reported)
  • F1 Score (Type) on german-english-email-ticket-classification: 0.857 (self-reported)
  • F1 Score (Queue) on german-english-email-ticket-classification: 0.521 (self-reported)