BERT NMB+ (Disinformation Sequence Classification):
Classifies sentences as "Likely" or "Unlikely" to contain bias or disinformation (maximum input length: 128 tokens).
Fine-tuned BERT (bert-base-uncased) on the headline and text_label fields in the News Media Bias Plus Dataset.
This model was trained with weighted sampling so that each batch contains 50% 'Likely' examples and 50% 'Unlikely' examples. The same model trained without weighted sampling is here; it achieved slightly better evaluation metrics. However, this model performed better when its predictions were evaluated with gpt-4o as a judge.
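The weighted sampling described above can be sketched in plain Python. This is a minimal illustration, not the training code used for this model: the helper names (`balanced_sample_weights`, `draw_batch`), the inverse-frequency weighting, and the 90/10 toy class split are all assumptions made for the example. Note that sampling of this kind balances batches in expectation rather than guaranteeing an exact 50/50 split in every batch.

```python
import random
from collections import Counter


def balanced_sample_weights(labels):
    """Give each example a weight inversely proportional to its class frequency.

    With these weights, every class has the same total probability mass.
    """
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def draw_batch(examples, labels, batch_size, rng):
    """Draw a batch with replacement using the class-balancing weights."""
    weights = balanced_sample_weights(labels)
    indices = rng.choices(range(len(examples)), weights=weights, k=batch_size)
    return [(examples[i], labels[i]) for i in indices]


# Hypothetical imbalanced toy data: 90 'Likely' and 10 'Unlikely' examples.
labels = ["Likely"] * 90 + ["Unlikely"] * 10
examples = [f"headline {i}" for i in range(len(labels))]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
batch = draw_batch(examples, labels, batch_size=10_000, rng=rng)

# Share of 'Likely' examples in the sampled batch (close to 0.5 in expectation).
likely_share = sum(1 for _, label in batch if label == "Likely") / len(batch)
print(f"Share of 'Likely' in sampled batch: {likely_share:.3f}")
```

In a PyTorch training loop, the same per-example weights could be passed to `torch.utils.data.WeightedRandomSampler`; whether the original training did so is not stated here.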
Metrics:
Evaluated on a random 10% sample of the NMB+ dataset that was held out from training.
- Accuracy: 0.6745
- Precision: 0.9070
- Recall: 0.6288
- F1 Score: 0.7427
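As a quick consistency check, the reported F1 score can be reproduced from the reported precision and recall, since F1 is their harmonic mean. The function name `f1_from_precision_recall` is a hypothetical helper introduced only for this sketch.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall; returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the metrics listed above.
precision = 0.9070
recall = 0.6288

f1 = f1_from_precision_recall(precision, recall)
print(round(f1, 4))  # 0.7427
```

The high precision combined with lower recall suggests the model rarely predicts the positive class incorrectly but misses a fair share of positive examples.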
How to Use:
from transformers import pipeline
classifier = pipeline("text-classification", model="maximuspowers/nmbp-bert-headlines-balanced")
# top_k=2 returns the scores for both labels, sorted from highest to lowest
result = classifier("He was a terrible politician.", top_k=2)
print(result)
Example Response:
[
{
'label': 'Likely',
'score': 0.9967995882034302
},
{
'label': 'Unlikely',
'score': 0.003200419945642352
}
]
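A response in this shape can be reduced to a single decision. The sketch below is one possible way to do it: `decide_label` and its `threshold` parameter are hypothetical and not part of the model or the transformers library, and the 0.5 default is an assumption rather than a recommended setting.

```python
def decide_label(scores, threshold=0.5):
    """Return 'Likely' if its score meets the threshold, otherwise 'Unlikely'.

    `scores` is a list of {'label': ..., 'score': ...} dicts, as in the
    example response above.
    """
    by_label = {item["label"]: item["score"] for item in scores}
    return "Likely" if by_label.get("Likely", 0.0) >= threshold else "Unlikely"


# The example response shown above, copied verbatim.
response = [
    {"label": "Likely", "score": 0.9967995882034302},
    {"label": "Unlikely", "score": 0.003200419945642352},
]

decision = decide_label(response)
print(decision)  # Likely
```

Raising the threshold trades recall for precision, which may matter given the recall reported in the metrics.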
Model tree for maximuspowers/nmbp-bert-headlines-balanced
Base model
google-bert/bert-base-uncased