📰 DistilBERT Fine-Tuned on AG News with and without Label Smoothing

This repository provides two fine-tuned DistilBERT models for topic classification on the AG News dataset:

  • ✅ model_no_smoothing: Fine-tuned without label smoothing
  • 🧪 model_label_smoothing: Fine-tuned with label smoothing (smoothing=0.1)

Both models use the same tokenizer (distilbert-base-uncased) and were trained with PyTorch and the Hugging Face Trainer.


🧠 Model Details

Model Name              Label Smoothing   Validation Loss   Epochs   Learning Rate
model_no_smoothing      ❌ No             0.1792            1        2e-5
model_label_smoothing   ✅ Yes (0.1)      0.5413            1        2e-5

Note that the two validation losses are not directly comparable: label smoothing raises the minimum achievable cross-entropy by construction, so the higher loss of model_label_smoothing does not by itself mean lower accuracy.

  • Base model: distilbert-base-uncased
  • Task: 4-class topic classification
  • Dataset: AG News (train: 120k, test: 7.6k; see the loading sketch below)
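
As a quick check of the split sizes, the dataset can be loaded straight from the Hugging Face Hub. This is a minimal sketch assuming the standard ag_news Hub dataset, which these models were trained on:

from datasets import load_dataset

# AG News: 120,000 training examples and 7,600 test examples
ds = load_dataset("ag_news")
print(ds["train"].num_rows, ds["test"].num_rows)  # 120000 7600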

📦 Repository Structure


/
├── model_no_smoothing/         # Model A - no smoothing
├── model_label_smoothing/      # Model B - label smoothing
├── tokenizer/                  # Tokenizer files (shared)
└── README.md

🧪 How to Use

Load Model A (No Smoothing)

from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "Koushim/distilbert-agnews/model_no_smoothing"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

inputs = tokenizer("Breaking news in the tech world!", return_tensors="pt")
outputs = model(**inputs)
pred = outputs.logits.argmax(dim=1).item()
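
If class probabilities are wanted instead of a hard prediction, apply a softmax to the logits. A short follow-on to the snippet above:

import torch

# Convert the raw logits into a probability distribution over the 4 classes
probs = torch.softmax(outputs.logits, dim=1)
print(probs.squeeze().tolist())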

Load Model B (Label Smoothing)

model_name = "Koushim/distilbert-agnews/model_label_smoothing"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

🏷️ Class Labels

The classification head uses the standard AG News label indices (a mapping snippet follows below):

  • 0: World
  • 1: Sports
  • 2: Business
  • 3: Sci/Tech
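
To map the integer prediction from the usage snippet back to a topic name, a plain dict suffices; if the label names were saved into the checkpoint config, model.config.id2label holds the same mapping. A small sketch using the indices listed above:

# Index-to-name mapping for AG News, matching the list above
id2label = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}
print(id2label[pred])  # pred comes from the "How to Use" snippet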

βš™οΈ Training Configuration

  • Framework: PyTorch + 🤗 Transformers
  • Optimizer: AdamW
  • Batch size: 16 (train/eval)
  • Epochs: 1
  • Learning rate: 2e-5
  • Max sequence length: 256
  • Loss: cross-entropy, with label smoothing (0.1) applied for Model B (a minimal Trainer sketch follows)
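
The training script itself is not part of this card, so the following is a minimal reconstruction from the hyperparameters above rather than the authors' actual code. It also shows that a custom loss is not strictly needed: Trainer applies label smoothing natively when TrainingArguments.label_smoothing_factor is set.

from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=4)

def tokenize(batch):
    # Truncate to the max sequence length quoted above
    return tokenizer(batch["text"], truncation=True, max_length=256)

ds = load_dataset("ag_news").map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="model_label_smoothing",
    num_train_epochs=1,
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    label_smoothing_factor=0.1,  # 0.0 reproduces model_no_smoothing
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=ds["train"],
    eval_dataset=ds["test"],
    tokenizer=tokenizer,  # enables dynamic padding via DataCollatorWithPadding
)
trainer.train()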

📄 License

Apache 2.0


✍️ Author

  • Hugging Face: Koushim
  • Trained with transformers.Trainer