Model Card for p60330av-evidence-detection

This is a binary classification model trained to detect whether a piece of evidence is relevant to a given claim.

Model Details

Model Description

This model is a fine-tuned version of roberta-base for the task of evidence detection. Given a claim and a piece of evidence, it predicts whether the evidence supports the claim (1) or does not support the claim (0).

  • Developed by: Ambar Vishnoi, Matthew O'Farrelly
  • Model type: Transformer-based deep learning classifier
  • Model architecture: Transformer (RoBERTa)
  • Language(s) (NLP): English
  • Finetuned from model: roberta-base

Model Resources

Repository: https://huggingface.co/FacebookAI/roberta-base

How to Get Started with the Model

from transformers import RobertaTokenizer, RobertaForSequenceClassification

MODEL_NAME = "ambarvish/ED_BERT_model"
tokenizer = RobertaTokenizer.from_pretrained(MODEL_NAME)
model = RobertaForSequenceClassification.from_pretrained(MODEL_NAME)

For further use see the demo: ev-detect-BERT-demo.ipynb
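A minimal inference sketch, assuming the label convention above (1 = the evidence supports the claim); the helper name predict_relevance is illustrative and not part of the released code:

```python
import torch

def predict_relevance(model, tokenizer, claim, evidence, max_length=512):
    """Return (label, confidence) for a claim/evidence pair.

    The claim and evidence are encoded as a single sequence pair;
    label 1 means the evidence supports the claim, 0 means it does not.
    """
    inputs = tokenizer(claim, evidence, truncation=True,
                       max_length=max_length, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    label = int(probs.argmax())
    return label, float(probs[label])
```

After loading the model and tokenizer as above, call e.g. predict_relevance(model, tokenizer, claim, evidence).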

Training Details

Training Data

~20k claim-evidence pairs.

Validation/development set: ~6k pairs.

Training Procedure

A grid search was performed over learning_rates = [2e-5, 3e-5, 5e-5], batch_sizes = [8, 16, 32], and epochs = [3, 4, 5] on the development set, holding out 20% of the data for validation.

These hyperparameters were then fixed and the model re-trained on the larger training dataset.
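The grid search above can be sketched as follows; train_and_score is a hypothetical stand-in for fine-tuning on 80% of the development set and scoring on the held-out 20%:

```python
from itertools import product

# Grid values as stated in the training procedure
learning_rates = [2e-5, 3e-5, 5e-5]
batch_sizes = [8, 16, 32]
epoch_counts = [3, 4, 5]

def grid_search(train_and_score):
    """train_and_score(lr, batch_size, epochs) -> validation score.

    Exhaustively tries all 27 combinations and returns the best
    (lr, batch_size, epochs) tuple together with its score.
    """
    best_score, best_config = float("-inf"), None
    for lr, bs, ep in product(learning_rates, batch_sizes, epoch_counts):
        score = train_and_score(lr, bs, ep)
        if score > best_score:
            best_score, best_config = score, (lr, bs, ep)
    return best_config, best_score
```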

Training Hyperparameters

Optimizer: AdamW

Loss Function: Focal loss (cross-entropy with focusing and balancing parameters; see [1])

Learning Rate: Grid-searched, optimal value found to be 2e-5

Batch Size: Grid-searched, optimal value found to be 32

Epochs: Grid-searched, optimal value found to be 4

Hardware Used: Trained on GPU (Google Colab environment)
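A minimal PyTorch sketch of the focal loss [1] named above; the gamma and alpha defaults below are illustrative (the card does not state the values used in training):

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=None):
    """Focal loss (Lin et al. [1]): cross-entropy scaled by (1 - p_t)**gamma.

    alpha is an optional tensor of per-class weights for class balancing.
    With gamma=0 and no alpha, this reduces to standard cross-entropy,
    which makes a quick sanity check on the implementation.
    """
    ce = F.cross_entropy(logits, targets, reduction="none")  # -log p_t
    pt = torch.exp(-ce)                                      # recover p_t
    loss = (1.0 - pt) ** gamma * ce
    if alpha is not None:
        loss = alpha[targets] * loss
    return loss.mean()
```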

Speeds, Sizes, Times

  • overall grid search time: ~1 hour
  • overall training time: ~1 hour
  • duration per training epoch: ~15 mins
  • model size: 499 MB (125M parameters, F32 safetensors)

Evaluation

Testing Data, Factors & Metrics

Testing Data

The validation/development set of ~6k claim-evidence pairs.

Metrics and results

The following metrics were used to provide a quantitative evaluation of the model:

Accuracy: 0.8815

Weighted Precision: 0.8859

Weighted Recall: 0.8815

Weighted F1-Score: 0.8831

Confusion Matrix: included in the evaluation notebook ev-detect-BERT.ipynb

MCC: 0.7136
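The reported metrics can be reproduced from model predictions with scikit-learn; evaluate_predictions is an illustrative helper, not part of the released notebook:

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             matthews_corrcoef,
                             precision_recall_fscore_support)

def evaluate_predictions(y_true, y_pred):
    """Compute the model card's metrics for binary labels (0/1)."""
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="weighted")
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "weighted_precision": precision,
        "weighted_recall": recall,
        "weighted_f1": f1,
        "mcc": matthews_corrcoef(y_true, y_pred),
        "confusion_matrix": confusion_matrix(y_true, y_pred).tolist(),
    }
```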

Technical Specifications

Bias, Risk and Limitations

  • The model may not generalize well to claims outside of its training distribution.
  • Inputs are limited to 512 tokens; anything longer will be truncated.

Citations

[1]: Lin, T.Y., Goyal, P., Girshick, R., He, K. and Dollár, P., 2017. Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision (pp. 2980-2988).
