BERT NMB+ (Disinformation Sequence Classification):

Classifies sentences as "Likely" or "Unlikely" to contain bias or disinformation (maximum input length: 128 tokens).

Fine-tuned BERT (bert-base-uncased) on the headline and text_label fields in the News Media Bias Plus Dataset.

This model was trained without weighted sampling on a dataset that contains 81.9% 'Likely' and 18.1% 'Unlikely' examples. The same model trained with weighted sampling performed better when evaluated with gpt-4o-mini as a judge and is available here.
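For readers unfamiliar with weighted sampling, here is a minimal sketch of the general idea using only the Python standard library. The source does not say how the weighted variant was actually sampled, so the inverse-frequency weights, the toy label list, and the draw size below are all assumptions made for illustration.

```python
import random
from collections import Counter

# Class proportions reported for the dataset.
class_share = {"Likely": 0.819, "Unlikely": 0.181}

# Hypothetical toy label list (1,000 examples) with the same imbalance.
labels = ["Likely"] * 819 + ["Unlikely"] * 181

# Inverse-frequency weight per example: the rarer class gets a larger weight.
weights = [1.0 / class_share[label] for label in labels]

# Draw with replacement; both classes now carry equal total weight,
# so the resampled labels come out roughly balanced.
rng = random.Random(0)
resampled = rng.choices(labels, weights=weights, k=10_000)
counts = Counter(resampled)
likely_fraction = counts["Likely"] / len(resampled)
print(counts, round(likely_fraction, 3))
```

In a PyTorch training loop the same weights could plausibly be passed to a weighted sampler, but that is a design choice the source does not confirm.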

Metrics

Evaluated on a random 10% sample of the NMB+ dataset, unseen during training:

  • Accuracy: 0.7990
  • Precision: 0.8096
  • Recall: 0.9556
  • F1 Score: 0.8766
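As a quick consistency check, the F1 score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The snippet below is only an arithmetic sketch over the published numbers; it does not rerun the evaluation.

```python
# Reported metrics from the list above.
precision = 0.8096
recall = 0.9556

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.8766
```

The result matches the reported F1 score to four decimal places.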

How to Use:

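from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline("text-classification", model="maximuspowers/nmbp-bert-headlines")
# top_k=2 returns the scores for both labels.
result = classifier("He was a terrible politician.", top_k=2)
print(result)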

Example Response:

[
  {
    'label': 'Likely',
    'score': 0.9967995882034302
  },
  {
    'label': 'Unlikely',
    'score': 0.003200419945642352
  }
]
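
One way to turn this response into a single decision is to take the highest-scoring label and optionally require a minimum score. The sketch below works on the example response shown above; the helper name `top_label` and the `threshold` parameter are hypothetical and not part of the model or the transformers API.

```python
# Example response copied from above.
result = [
    {"label": "Likely", "score": 0.9967995882034302},
    {"label": "Unlikely", "score": 0.003200419945642352},
]


def top_label(scores, threshold=0.5):
    """Return the highest-scoring label, or None if its score is below threshold."""
    best = max(scores, key=lambda item: item["score"])
    return best["label"] if best["score"] >= threshold else None


print(top_label(result))  # Likely
```

Raising `threshold` makes the helper abstain on low-confidence inputs, which may be useful given the class imbalance in the training data.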

Model size: 109M params (Safetensors, tensor type F32)