FoodEx2 Facet Descriptors Reranker
This is a CrossEncoder reranker model fine-tuned from distilroberta-base for the FoodEx2 domain. It is optimized for re-ranking attribute-related candidate sentences by relevance for food description and classification tasks. The model was trained on the disi-unibo-nlp/foodex2-clean dataset, with additional negative examples drawn from the disi-unibo-nlp/foodex2-terms dataset.
Model Details
- Model Type: CrossEncoder Reranker
- Base Model: distilroberta-base
- Maximum Sequence Length: 256 tokens
- Training Epochs: 10
- Batch Size: 256 (train), 64 (evaluation)
- Learning Rate: 2e-05
- Warmup Steps: 100
- Evaluation Steps: 1000
- Optimizer: AdamW with Warmup Cosine scheduler
- Precision: FP32 (no mixed precision)
- Negative Sampling: 3 negatives per sample
- Task Number: 3 (Attribute prediction)
This model employs a CrossEncoder architecture to compute similarity scores between candidate sentence pairs, enabling accurate differentiation among subtle variations in food-related attributes and terminologies.
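The reranking flow can be sketched independently of the model weights: a cross-encoder receives each (context, candidate) pair jointly and emits one relevance score, and candidates are then sorted by that score. In the toy sketch below, `score_pair` is a hypothetical stand-in for the fine-tuned model (here, simple token overlap); any real scorer can be dropped in.

```python
# Toy sketch of cross-encoder reranking: each (context, candidate) pair is
# scored jointly, then candidates are sorted by score. `score_pair` is a
# stand-in for the fine-tuned model, not its actual scoring function.
def score_pair(context: str, candidate: str) -> float:
    # Hypothetical scorer: fraction of candidate tokens shared with the context.
    ctx = set(context.lower().split())
    cand = set(candidate.lower().split())
    return len(ctx & cand) / max(len(cand), 1)

def rerank(context: str, candidates: list[str]) -> list[tuple[str, float]]:
    scored = [(c, score_pair(context, c)) for c in candidates]
    return sorted(scored, key=lambda x: x[1], reverse=True)

ranking = rerank(
    "grilled chicken breast",
    ["Process: grilling", "Process: freeze-drying", "Part: breast meat"],
)
```

The key point is that both sentences are encoded together, which is what lets the model separate subtle attribute variations that independent embeddings would blur.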
Training Details
The model was trained using a custom training script with the following parameters:
- Dataset: disi-unibo-nlp/foodex2-clean (positives), disi-unibo-nlp/foodex2-terms (negatives)
- Validation Ratio: 10%
- Evaluation on Test Set: Enabled
- Loss Function: Pairwise ranking loss suitable for re-ranking tasks
- Checkpoint Save Limit: 2 checkpoints
- Max Gradient Norm: 1.0
- Weight Decay: 0.01
- Seed: 42
- Random Negatives: Disabled (hard negatives used)
- Maximum Evaluation Instances: 20
Training was performed in a GPU-accelerated environment for improved efficiency and reproducibility.
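The training-pair construction described above (one positive per example plus 3 hard negatives, seed 42) can be sketched as follows. This is a minimal illustration with hypothetical names; the actual script mines hard negatives from the foodex2-terms pool, whereas here a seeded random draw stands in for that mining step.

```python
import random

# Sketch of training-pair construction: for each positive (description, term)
# pair, 3 negatives are drawn from a candidate term pool. Seed 42 matches the
# training config; random draws stand in for hard-negative mining.
def build_training_pairs(positives, term_pool, n_negatives=3, seed=42):
    rng = random.Random(seed)
    pairs = []
    for description, gold_term in positives:
        pairs.append((description, gold_term, 1))        # positive, label 1
        candidates = [t for t in term_pool if t != gold_term]
        for neg in rng.sample(candidates, n_negatives):  # 3 negatives per sample
            pairs.append((description, neg, 0))          # negative, label 0
    return pairs

pairs = build_training_pairs(
    [("boiled potato", "Process: boiling")],
    ["Process: boiling", "Process: frying", "Process: baking",
     "Process: smoking", "Process: canning"],
)
```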
Evaluation
The model was evaluated on the FoodEx2 attribute task test set using standard retrieval and ranking metrics:
| Metric | Value |
|---|---|
| Accuracy@1 | 0.9603 |
| Accuracy@3 | 0.9958 |
| Accuracy@5 | 1.0000 |
| Accuracy@10 | 1.0000 |
| Precision@1 | 0.9603 |
| Recall@1 | 0.8472 |
| Precision@3 | 0.4167 |
| Recall@3 | 0.9859 |
| Precision@5 | 0.2971 |
| Recall@5 | 0.9974 |
| Precision@10 | 0.2583 |
| Recall@10 | 0.9996 |
| MRR@10 | 0.9781 |
| NDCG@10 | 0.9817 |
| MAP@100 | 0.9736 |
| Avg Seconds per Example | 0.00139 |
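For reference, the @k metrics in the table have standard definitions over a ranked candidate list and a set of gold items; a minimal sketch (toy data, illustrative names only):

```python
# Standard ranking metrics over a ranked candidate list and a gold set.
def accuracy_at_k(ranked, gold, k):
    # 1 if any gold item appears in the top k, else 0.
    return 1.0 if any(c in gold for c in ranked[:k]) else 0.0

def precision_at_k(ranked, gold, k):
    # Fraction of the top k that is gold.
    return sum(c in gold for c in ranked[:k]) / k

def recall_at_k(ranked, gold, k):
    # Fraction of gold items recovered in the top k.
    return sum(c in gold for c in ranked[:k]) / len(gold)

def mrr_at_k(ranked, gold, k):
    # Reciprocal rank of the first gold item within the top k.
    for i, c in enumerate(ranked[:k], start=1):
        if c in gold:
            return 1.0 / i
    return 0.0

ranked = ["grilling", "boiling", "breast meat"]
gold = {"grilling", "breast meat"}
```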
Additionally, the model achieved a binary classification score of 0.9861, with the optimal threshold identified at 0.5454, yielding an F1 score of 0.3328.
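Applying that reported threshold amounts to treating any pair whose score exceeds ~0.5454 as relevant. A minimal sketch, with placeholder scores standing in for model outputs:

```python
# Binary decision at the reported optimal threshold. The scores below are
# placeholders, not actual model outputs.
THRESHOLD = 0.5454

def classify(score: float) -> int:
    # 1 = relevant pair, 0 = irrelevant pair.
    return 1 if score >= THRESHOLD else 0

preds = [classify(s) for s in [0.91, 0.12, 0.58, 0.40]]
```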
Usage
To use the attribute-reranker model, follow these steps:
Install Dependencies:
```bash
pip install transformers torch
```
Load the Model:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "disi-unibo-nlp/foodex-facet-descriptors-reranker"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# A cross-encoder scores the two sentences jointly, so pass them to the
# tokenizer as one sentence pair (text, text_pair), not as a batch of two
# independent sentences.
context = "Food product or context sentence"
candidate = "Candidate attribute description"
inputs = tokenizer(context, candidate, return_tensors="pt",
                   padding=True, truncation=True, max_length=256)

with torch.no_grad():
    outputs = model(**inputs)

score = outputs.logits
print(score)
```
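With scores in hand for several candidate descriptors, reranking is just a descending sort over the logits. A minimal sketch, with placeholder scores standing in for model outputs:

```python
# Reranking several candidates: score each (context, candidate) pair,
# then sort candidates by score. The logits below are placeholders.
candidates = ["Process: grilling", "Process: boiling", "Part: breast meat"]
logits = [2.7, -1.3, 0.4]  # hypothetical model outputs, one per candidate

reranked = sorted(zip(candidates, logits), key=lambda x: x[1], reverse=True)
top1 = reranked[0][0]
```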
Citation
If you use this model in your research, please cite the following work:
@article{sanh2019distilbert,
title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
author={Sanh, Victor and Debut, Lysandre and Chaumond, Julien and Wolf, Thomas},
journal={arXiv preprint arXiv:1910.01108},
year={2019}
}