Email Processing ModernBERT Model

A ModernBERT model fine-tuned for email processing tasks.

Model Capabilities

This model can compute semantic similarity between questions and answers related to:

  • Email addresses
  • Subject lines

Recommended Thresholds

Based on extensive testing, the following cosine-similarity thresholds are recommended for judging whether an answer is relevant to a question:

  • For email questions: 0.85
  • For subject questions: 0.70
  • For other questions: 0.80

Additional content-aware checks are recommended for best results.
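
As an illustration, the sketch below routes a question to the recommended threshold and adds one simple content-aware check for email answers (a regex that looks for an address). The pick_threshold, looks_like_email_answer, and is_relevant helpers, the keyword-based routing, and the regex are illustrative assumptions, not part of the released model.

import re

# Hypothetical helper: pick the recommended threshold from the question text.
def pick_threshold(question: str) -> float:
    q = question.lower()
    if "email" in q:
        return 0.85
    if "subject" in q:
        return 0.70
    return 0.80

# Hypothetical content-aware check: an email answer should contain an address.
def looks_like_email_answer(answer: str) -> bool:
    return re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", answer) is not None

# Combine the similarity score, the per-type threshold, and the content check.
def is_relevant(question: str, answer: str, similarity: float) -> bool:
    if similarity < pick_threshold(question):
        return False
    if "email" in question.lower() and not looks_like_email_answer(answer):
        return False
    return True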

Usage

from sentence_transformers import SentenceTransformer
import torch

# Load the model
model = SentenceTransformer('sugiv/email-processing-modernbert')

# Encode questions and answers
q_embed = model.encode("What's your email address?", convert_to_tensor=True)
a1_embed = model.encode("My email is [email protected]", convert_to_tensor=True)
a2_embed = model.encode("The weather is nice today", convert_to_tensor=True)

# Calculate similarity
similarity1 = torch.nn.functional.cosine_similarity(q_embed.unsqueeze(0), a1_embed.unsqueeze(0)).item()
similarity2 = torch.nn.functional.cosine_similarity(q_embed.unsqueeze(0), a2_embed.unsqueeze(0)).item()

print(f'Similarity with relevant answer: {similarity1:.4f}')
print(f'Similarity with irrelevant answer: {similarity2:.4f}')

# Apply threshold
threshold = 0.85  # For email questions
print(f'Is relevant: {similarity1 >= threshold}')
print(f'Is irrelevant: {similarity2 < threshold}')
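
The same pattern applies to subject-line questions with the 0.70 threshold. The sketch below scores several candidate answers in one pass using sentence_transformers.util.cos_sim; the question and candidate texts are made up for illustration.

from sentence_transformers import SentenceTransformer, util

# Reusing the model loaded above also works; reloaded here to keep the snippet self-contained.
model = SentenceTransformer('sugiv/email-processing-modernbert')

question = "What is the subject of your email?"
candidates = [
    "The subject line is 'Quarterly report review'",
    "I'll be out of office next week",
]

# Encode the question and all candidate answers.
q_embed = model.encode(question, convert_to_tensor=True)
c_embeds = model.encode(candidates, convert_to_tensor=True)

# util.cos_sim returns a (1, num_candidates) tensor of cosine similarities.
scores = util.cos_sim(q_embed, c_embeds)[0]

threshold = 0.70  # Recommended threshold for subject questions
for answer, score in zip(candidates, scores):
    print(f"{score.item():.4f}  relevant={score.item() >= threshold}  {answer}")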

Training Information

  • Base model: answerdotai/ModernBERT-base
  • Published date: 2025-04-24
  • Training approach: Fine-tuned on a balanced dataset of email and subject questions
  • Framework: sentence-transformers with PyTorch
  • Model size: 149M parameters (Safetensors, F32)