FoodEx2 Facet Descriptors Reranker

This is a CrossEncoder reranker model fine-tuned from distilroberta-base for the FoodEx2 domain. It is specifically optimized for re-ranking attribute-related candidate sentences based on relevance and precision for food description and classification tasks. The model was trained on the disi-unibo-nlp/foodex2-clean dataset, with additional negative examples from the disi-unibo-nlp/foodex2-terms dataset.

Model Details

  • Model Type: CrossEncoder Reranker
  • Base Model: distilroberta-base
  • Maximum Sequence Length: 256 tokens
  • Training Epochs: 10
  • Batch Size: 256 (train), 64 (evaluation)
  • Learning Rate: 2e-05
  • Warmup Steps: 100
  • Evaluation Steps: 1000
  • Optimizer: AdamW with Warmup Cosine scheduler
  • Precision: FP32 (no mixed precision)
  • Negative Sampling: 3 negatives per sample
  • Task Number: 3 (Attribute prediction)
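
The learning-rate settings above (2e-05 peak, 100 warmup steps, warmup cosine schedule) can be sketched in plain Python. This is a minimal illustration of the usual linear-warmup-then-cosine-decay shape, not the training script itself; `total_steps` is a hypothetical value, since the card does not state the number of optimizer steps.

```python
import math

def warmup_cosine_lr(step, total_steps, base_lr=2e-5, warmup_steps=100):
    """Learning rate at a given step: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, clipped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))

# total_steps is an assumption made only for this illustration.
total_steps = 5000
for step in (0, 50, 100, 2550, 5000):
    print(step, f"{warmup_cosine_lr(step, total_steps):.2e}")
```

The rate rises linearly during the first 100 steps, peaks at 2e-05, and then follows half a cosine period down to zero at the final step.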

This model uses a CrossEncoder architecture: the two sentences of each pair are encoded jointly and the model outputs a relevance score for the pair, which helps it distinguish subtle variations in food-related attributes and terminology.

Training Details

The model was trained using a custom training script with the following parameters:

  • Dataset: disi-unibo-nlp/foodex2-clean (positives), disi-unibo-nlp/foodex2-terms (negatives)
  • Validation Ratio: 10%
  • Evaluation on Test Set: Enabled
  • Loss Function: Pairwise ranking loss suitable for re-ranking tasks
  • Checkpoint Save Limit: 2 checkpoints
  • Max Gradient Norm: 1.0
  • Weight Decay: 0.01
  • Seed: 42
  • Random Negatives: Disabled (hard negatives used)
  • Maximum Evaluation Instances: 20
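
The card does not name the exact pairwise ranking loss. As an illustration only, one common form is the pairwise logistic loss, which pushes the positive candidate's score above each negative's score. The sketch below applies it to one sample with 3 negatives, matching the negative-sampling setting above; the function name and the example scores are hypothetical.

```python
import math

def pairwise_logistic_loss(pos_score, neg_scores):
    """Mean of log(1 + exp(-(s_pos - s_neg))) over the negatives of one sample."""
    losses = []
    for neg in neg_scores:
        margin = pos_score - neg
        # Numerically stable softplus(-margin).
        losses.append(max(-margin, 0.0) + math.log1p(math.exp(-abs(margin))))
    return sum(losses) / len(losses)

# Hypothetical scores: one positive and three hard negatives.
print(pairwise_logistic_loss(2.0, [0.5, -1.0, 1.8]))
# A better-separated positive gives a smaller loss.
print(pairwise_logistic_loss(6.0, [0.5, -1.0, 1.8]))
```

Hard negatives, which score close to the positive, contribute the largest terms to this loss. That is one reason they tend to be more informative than random negatives.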

Training was performed on a GPU, with a fixed random seed (42) to support reproducibility.

Evaluation

The model was evaluated on the FoodEx2 attribute task test set using standard retrieval and ranking metrics:

Metric                     Value
Accuracy@1                 0.9603
Accuracy@3                 0.9958
Accuracy@5                 1.0000
Accuracy@10                1.0000
Precision@1                0.9603
Recall@1                   0.8472
Precision@3                0.4167
Recall@3                   0.9859
Precision@5                0.2971
Recall@5                   0.9974
Precision@10               0.2583
Recall@10                  0.9996
MRR@10                     0.9781
NDCG@10                    0.9817
MAP@100                    0.9736
Avg Seconds per Example    0.00139
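
To make the metrics above concrete, here is a minimal sketch of how Accuracy@k, Precision@k, Recall@k and reciprocal rank are commonly computed for a single query. The card does not give formulas, so these are the usual definitions and are assumed rather than confirmed; the example ranking is hypothetical.

```python
def ranking_metrics(ranked_labels, k):
    """ranked_labels: 0/1 relevance flags for one query, best-ranked first."""
    top_k = ranked_labels[:k]
    hits = sum(top_k)
    total_relevant = sum(ranked_labels)
    # 1-based rank of the first relevant item within the top k, if any.
    first_hit = next((i for i, rel in enumerate(top_k, start=1) if rel), None)
    return {
        "accuracy": 1.0 if hits > 0 else 0.0,   # at least one hit in the top k
        "precision": hits / k,                  # share of the top k that is relevant
        "recall": hits / total_relevant if total_relevant else 0.0,
        "rr": 1.0 / first_hit if first_hit else 0.0,
    }

# Hypothetical ranking: 2 relevant candidates among 10, at ranks 1 and 3.
ranked = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
for k in (1, 3, 10):
    print(k, ranking_metrics(ranked, k))
```

Under these definitions, Precision@k necessarily drops as k grows when each query has only a few relevant candidates, while Recall@k approaches 1. This is the pattern seen in the table. MRR@10 is the mean of `rr` over all queries with k = 10.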

Additionally, the model achieved a binary classification score of 0.9861, with the optimal threshold identified at 0.5454, yielding an F1 score of 0.3328.

Usage

To use the attribute-reranker model, follow these steps:

  1. Install Dependencies:

    pip install transformers torch
    
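  2. Load the Model and Score a Sentence Pair:

    from transformers import AutoTokenizer, AutoModelForSequenceClassification
    import torch
    
    # Repository ID of this model on the Hub. Access is gated, so accept the
    # conditions on the model page and log in before downloading.
    model_id = "disi-unibo-nlp/foodex-facet-descriptors-reranker"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    
    candidate = "Candidate attribute description"
    context = "Food product or context sentence"
    
    # Pass the two texts as separate arguments so they are encoded jointly as one pair.
    # A single list of two strings would be tokenized as two independent inputs.
    inputs = tokenizer(candidate, context, return_tensors="pt", truncation=True, max_length=256)
    with torch.no_grad():
        scores = model(**inputs).logits  # relevance score for the pair
    print(scores)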
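
Once each candidate has a score, re-ranking amounts to sorting candidates by that score. The sketch below shows this step with a pluggable scoring function. `rerank`, `toy_score` and the example strings are hypothetical, and `toy_score` is only a word-overlap stand-in so the example runs without downloading the model; in practice the scoring function would wrap the tokenizer and model call from the usage snippet.

```python
def rerank(context, candidates, score_fn):
    """Return (candidate, score) pairs sorted from most to least relevant."""
    scored = [(cand, float(score_fn(cand, context))) for cand in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def toy_score(candidate, context):
    """Stand-in scorer (NOT the model): share of candidate words found in the context."""
    ctx_words = set(context.lower().split())
    cand_words = set(candidate.lower().split())
    return len(ctx_words & cand_words) / max(1, len(cand_words))

# Hypothetical food description and candidate attribute descriptions.
context = "smoked salmon fillet with skin"
candidates = ["boiled in water", "smoked fish", "with skin"]
for cand, score in rerank(context, candidates, toy_score):
    print(f"{score:.2f}  {cand}")
```

Keeping the scorer as a parameter means the same sorting logic can be reused with the real cross-encoder scores or with batched inference.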
    

Citation

If you use this model in your research, please cite the following work:

@article{sanh2019distilbert,
  title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
  author={Sanh, Victor and Debut, Lysandre and Chaumond, Julien and Wolf, Thomas},
  journal={arXiv preprint arXiv:1910.01108},
  year={2019}
}