bert-20news-classification

This model is a fine-tuned version of distilbert-base-uncased on the 20 Newsgroups dataset. After the final training epoch it achieves the following results on the training and evaluation sets:

  • Train Loss: 0.0479
  • Train Accuracy: 0.9922
  • Validation Loss: 0.2769
  • Validation Accuracy: 0.9284
  • Epoch: 9

Model description

This model is DistilBERT (distilbert-base-uncased) fine-tuned for sequence classification. It was trained with Hugging Face Transformers and TensorFlow. Input text must be tokenized with the DistilBERT tokenizer.

The model was trained to classify text into the 20 categories of the 20 Newsgroups dataset: 'alt.atheism', 'comp.graphics', 'comp.os.ms-windows.misc', 'comp.sys.ibm.pc.hardware', 'comp.sys.mac.hardware', 'comp.windows.x', 'misc.forsale', 'rec.autos', 'rec.motorcycles', 'rec.sport.baseball', 'rec.sport.hockey', 'sci.crypt', 'sci.electronics', 'sci.med', 'sci.space', 'soc.religion.christian', 'talk.politics.guns', 'talk.politics.mideast', 'talk.politics.misc', 'talk.religion.misc'.
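A minimal sketch of how the model's output could be turned into a category name. It assumes the class indices follow the order in which the categories are listed in this card, which the card does not confirm; the authoritative mapping is the `id2label` field in the model's config. `LABELS`, `softmax`, `decode_logits`, and `classify` are illustrative names, and `model_id` is left as a parameter because the card does not give the full repository ID.

```python
import math

# Assumption: index i corresponds to the i-th category as listed in this card.
LABELS = [
    "alt.atheism", "comp.graphics", "comp.os.ms-windows.misc",
    "comp.sys.ibm.pc.hardware", "comp.sys.mac.hardware", "comp.windows.x",
    "misc.forsale", "rec.autos", "rec.motorcycles", "rec.sport.baseball",
    "rec.sport.hockey", "sci.crypt", "sci.electronics", "sci.med",
    "sci.space", "soc.religion.christian", "talk.politics.guns",
    "talk.politics.mideast", "talk.politics.misc", "talk.religion.misc",
]


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_logits(logits, labels=LABELS):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def classify(texts, model_id):
    """Tokenize `texts`, run the TensorFlow model, and decode each prediction.

    `model_id` is a placeholder for the Hub repository ID or a local path.
    """
    # Imported lazily so the helpers above work without TensorFlow installed.
    from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = TFAutoModelForSequenceClassification.from_pretrained(model_id)
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="tf")
    logits = model(**inputs).logits.numpy().tolist()
    return [decode_logits(row) for row in logits]
```

If the config's `id2label` differs from the order assumed here, pass that mapping's values as `labels` instead.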

Intended uses & limitations

This model is intended for classifying text into the 20 categories listed above. It can also be used to categorize text from similar domains or topics.

Training and evaluation data

The model was trained on 90% of the 20 Newsgroups dataset; the remaining 10% was held out for validation.
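One way such a 90/10 split could be produced is sketched below. The card does not say whether the data was shuffled, stratified by category, or split with a fixed seed, so the shuffle and the `seed` value here are assumptions, and `split_90_10` is an illustrative name.

```python
import random


def split_90_10(examples, seed=0):
    """Shuffle the examples and return (train, validation) as a 90/10 split."""
    items = list(examples)
    # Assumption: a seeded random shuffle; the card does not describe the method.
    random.Random(seed).shuffle(items)
    cut = len(items) * 9 // 10  # integer arithmetic avoids float rounding
    return items[:cut], items[cut:]


train, validation = split_90_10(range(1000))
print(len(train), len(validation))
```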

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 2120, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
  • training_precision: float32
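The learning-rate entry in the optimizer config describes a linear decay from 2e-05 to 0 over 2120 steps (`power` 1.0, `cycle` False). The sketch below reproduces that schedule in plain Python, using the formula documented for Keras `PolynomialDecay`; the function name `polynomial_decay` is illustrative. If the schedule spans the 10 epochs reported here, that would be 212 steps per epoch, though the card does not state the batch size.

```python
def polynomial_decay(step, initial_lr=2e-05, decay_steps=2120,
                     end_lr=0.0, power=1.0):
    """Learning rate at `step` for a non-cycling polynomial decay schedule."""
    # With cycle=False the step is capped, so the rate stays at end_lr afterwards.
    step = min(step, decay_steps)
    fraction = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * fraction ** power + end_lr


for step in (0, 530, 1060, 2120):
    print(step, polynomial_decay(step))
```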

Training results

Train Loss   Train Accuracy   Validation Loss   Validation Accuracy   Epoch
1.8498       0.5829           0.9285            0.8012                0
0.6611       0.8406           0.4800            0.8807                1
0.3563       0.9128           0.3829            0.9002                2
0.2276       0.9475           0.3593            0.9072                3
0.1544       0.9659           0.3205            0.9214                4
0.1094       0.9779           0.3007            0.9214                5
0.0825       0.9846           0.2821            0.9258                6
0.0634       0.9895           0.2754            0.9337                7
0.0533       0.9916           0.2707            0.9337                8
0.0479       0.9922           0.2769            0.9284                9
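The per-epoch results can be read programmatically to see where validation performance peaks. The sketch below uses the values transcribed from the training results in this card; `HISTORY` and the other variable names are illustrative. Validation loss is lowest at epoch 8 (0.2707), and validation accuracy peaks at 0.9337 in epochs 7 and 8 before slipping to 0.9284 at epoch 9.

```python
# (epoch, train_loss, train_accuracy, val_loss, val_accuracy)
HISTORY = [
    (0, 1.8498, 0.5829, 0.9285, 0.8012),
    (1, 0.6611, 0.8406, 0.4800, 0.8807),
    (2, 0.3563, 0.9128, 0.3829, 0.9002),
    (3, 0.2276, 0.9475, 0.3593, 0.9072),
    (4, 0.1544, 0.9659, 0.3205, 0.9214),
    (5, 0.1094, 0.9779, 0.3007, 0.9214),
    (6, 0.0825, 0.9846, 0.2821, 0.9258),
    (7, 0.0634, 0.9895, 0.2754, 0.9337),
    (8, 0.0533, 0.9916, 0.2707, 0.9337),
    (9, 0.0479, 0.9922, 0.2769, 0.9284),
]

# Epoch with the lowest validation loss.
best_by_loss = min(HISTORY, key=lambda row: row[3])

# All epochs that reach the highest validation accuracy.
best_val_acc = max(row[4] for row in HISTORY)
epochs_best_acc = [row[0] for row in HISTORY if row[4] == best_val_acc]

# Gap between training and validation accuracy at the final epoch.
final = HISTORY[-1]
final_gap = final[2] - final[4]

print("lowest validation loss at epoch", best_by_loss[0])
print("highest validation accuracy at epochs", epochs_best_acc)
print("final train/validation accuracy gap: %.4f" % final_gap)
```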

Framework versions

  • Transformers 4.28.0
  • TensorFlow 2.12.0
  • Datasets 2.12.0
  • Tokenizers 0.13.3