DistilBART-MNLI for Zero-Shot Text Classification, Fine-Tuned on CNN News Articles

This is a Hugging Face model for zero-shot text classification, fine-tuned from DistilBART-MNLI on the CNN news dataset. The model achieved an F1 score of 93% and an accuracy of 93% on the CNN test dataset with a maximum length of 128 tokens.

Authors

This work was done by CHERGUELAINE Ayoub and BOUBEKRI Faycal.

Original Model

valhalla/distilbart-mnli-12-1

Model Architecture

The model architecture is based on the DistilBART-MNLI transformer model. DistilBART is a smaller, faster distilled version of BART. The MNLI variant is fine-tuned on the MultiNLI natural language inference task, which is what allows it to be used for zero-shot classification.
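To make the NLI connection concrete, here is a minimal sketch of how an NLI model can act as a zero-shot classifier: each candidate label is turned into a hypothesis, and the entailment scores are normalized across labels. The template string is the default used by the Transformers zero-shot pipeline; the function names and the example logits in the usage note are illustrative assumptions, not values taken from this model.

```python
import math


def build_hypotheses(labels, template="This example is {}."):
    """Turn each candidate label into an NLI hypothesis sentence."""
    return [template.format(label) for label in labels]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(labels, entailment_logits):
    """Pair each label with its normalized entailment score, best first.

    `entailment_logits` is assumed to hold one entailment logit per label,
    as produced by scoring (article, hypothesis) pairs with an NLI model.
    """
    probs = softmax(entailment_logits)
    return sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
```

For example, `rank_labels(["politics", "sports", "health"], [0.5, 2.0, -1.0])` places "sports" first, since it has the highest (made-up) entailment logit.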

Dataset

The model was fine-tuned on the CNN news dataset. This dataset contains news articles from the CNN website, labeled by category, including politics, health, entertainment, tech, travel, world, and sports.

Fine-tuning Parameters

The model was fine-tuned for 1 epoch with a maximum sequence length of 256 tokens. Training took approximately 6 hours to complete.
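The exact training script is not given here. One common recipe for fine-tuning an NLI model on labeled articles is to convert each article into premise/hypothesis pairs: the true category becomes an "entailment" example and a randomly chosen wrong category becomes a "contradiction" example. The sketch below illustrates that recipe only; it is an assumption, not a description of how this model was actually trained, and `make_nli_pairs` is a hypothetical helper.

```python
import random


def make_nli_pairs(text, true_label, all_labels,
                   template="This example is {}.", rng=None):
    """Build one entailment and one contradiction pair for a labeled article.

    Assumed recipe, not confirmed by the authors: the true label yields an
    entailment hypothesis, and one randomly sampled wrong label yields a
    contradiction hypothesis.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    wrong_label = rng.choice([l for l in all_labels if l != true_label])
    return [
        {"premise": text,
         "hypothesis": template.format(true_label),
         "label": "entailment"},
        {"premise": text,
         "hypothesis": template.format(wrong_label),
         "label": "contradiction"},
    ]
```

Each resulting pair would then be tokenized as a sentence pair and truncated to the chosen maximum length before training.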

Evaluation Metrics

On the CNN test dataset, evaluated with a maximum length of 128 tokens, the model achieved an F1 score of 93% and an accuracy of 93%.
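For readers who want to compute comparable metrics on their own predictions, here is a small dependency-free sketch of accuracy and F1. The averaging method behind the reported F1 is not stated, so macro averaging is used here purely as an assumption; the function names are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is assumed)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Both functions take two equal-length lists of label strings: the gold categories and the top predicted category for each article.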

Usage

The model can be used for zero-shot text classification of news articles. It can be loaded with the Hugging Face Transformers library using the following code:

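import torch
from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("AyoubChLin/DistilBart_cnn_zeroShot")
model = AutoModelForSequenceClassification.from_pretrained("AyoubChLin/DistilBart_cnn_zeroShot")

# Run on the first GPU if one is available, otherwise fall back to CPU (-1)
classifier = pipeline(
    "zero-shot-classification",
    model=model,
    tokenizer=tokenizer,
    device=0 if torch.cuda.is_available() else -1,
)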
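Once a zero-shot pipeline is available, classifying an article comes down to passing the text and a list of candidate labels. The sketch below wraps that call; `classify_article` and `load_classifier` are hypothetical helper names, and the candidate labels in the usage note are only examples. The pipeline returns a dictionary whose `labels` and `scores` lists are sorted from most to least likely.

```python
def classify_article(classifier, text, candidate_labels):
    """Return the top label and its score for one article.

    `classifier` is any callable with the zero-shot pipeline interface:
    it accepts a text and `candidate_labels`, and returns a dict with
    "labels" and "scores" sorted by descending score.
    """
    result = classifier(text, candidate_labels=candidate_labels)
    return result["labels"][0], result["scores"][0]


def load_classifier(model_id="AyoubChLin/DistilBart_cnn_zeroShot"):
    """Build the zero-shot pipeline (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily to keep the sketch light
    return pipeline("zero-shot-classification", model=model_id)
```

A typical call would be `classify_article(load_classifier(), article_text, ["politics", "health", "sports"])`, where `article_text` is the news article to classify.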

Acknowledgments

We would like to thank the Hugging Face team for their open-source implementation of transformer models, and the creators of the CNN news dataset for providing the labeled data used for fine-tuning.

Model Size

307M parameters, stored as Safetensors with tensor type F32.