metadata

library_name: transformers
tags:
  - generated_from_trainer
  - mnli
  - text-classification
  - bert
metrics:
  - accuracy
  - f1
model-index:
  - name: mnli-finetuned-bert-base-cased
    results:
      - task:
          type: text-classification
          name: Natural Language Inference
        dataset:
          name: MultiNLI
          type: nyu-mll/glue
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.6368
          - name: F1
            type: f1
            value: 0.6358
license: mit
datasets:
  - nyu-mll/glue
language:
  - en
base_model:
  - google-bert/bert-base-cased
pipeline_tag: text-classification

soonbob/mnli-finetuned-bert-base-cased

This model was made as a practice exercise in fine-tuning BERT on the MNLI dataset.

This is a BERT-based model fine-tuned on the Multi-Genre Natural Language Inference (MultiNLI) dataset for the task of natural language inference (NLI), using Hugging Face's Trainer.

It classifies a pair of sentences into one of the following classes:

entailment
neutral
contradiction
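As a rough illustration of the classification step, the sketch below turns three output logits into one of the classes above and shows how the checkpoint might be loaded with the standard `AutoTokenizer` and `AutoModelForSequenceClassification` classes. The label order (0 = entailment, 1 = neutral, 2 = contradiction) is an assumption based on the GLUE MNLI convention, not something this card states; check `model.config.id2label` before relying on it. The helper names `label_from_logits` and `predict_nli` are hypothetical.

```python
import math

# ASSUMPTION: label order follows the GLUE MNLI convention.
# Verify against model.config.id2label for this checkpoint.
ASSUMED_LABELS = ["entailment", "neutral", "contradiction"]


def label_from_logits(logits):
    """Apply a softmax to the logits and return (label, probability) of the argmax."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return ASSUMED_LABELS[best], probs[best]


def predict_nli(premise, hypothesis,
                model_id="soonbob/mnli-finetuned-bert-base-cased"):
    """Download the checkpoint and classify one premise/hypothesis pair."""
    # Imported lazily so label_from_logits works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Passing two strings encodes them as a single sentence pair.
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return label_from_logits(logits)
```

Calling `predict_nli("A man is playing a guitar.", "A person is making music.")` would download the weights on first use and return a label with its softmax probability.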

🧠 Intended Use

This model can be used for:

Evaluating whether one sentence logically follows from another
Sentence-pair classification tasks
Transfer learning for other NLI-style problems

The model achieves the following results on the evaluation set:

Loss: 0.8276
Accuracy: 0.6368
F1: 0.6358
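For readers who want to recompute these metrics, here is a minimal sketch of accuracy and F1 for a three-class problem. The card does not say how F1 is averaged across classes, so macro averaging is an assumption here, and the labels in the toy example are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, num_classes=3):
    """Unweighted mean of per-class F1 (ASSUMPTION: macro averaging)."""
    scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy labels, not taken from the model's actual predictions.
y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 1, 1, 0, 2, 2]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```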

⚙️ Training Details

Base model: google-bert/bert-base-cased
Dataset: nyu-mll/glue, subset: mnli
Epochs: 3
Learning rate: 1e-3
Optimizer: AdamW
Scheduler: Linear
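To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (no warmup is listed on this card) and a total of 7365 steps, which is the final step count reported in the training logs. The function name `linear_lr` is hypothetical.

```python
def linear_lr(step, base_lr=1e-3, total_steps=7365, warmup_steps=0):
    """Learning rate under linear decay to zero, with optional linear warmup.

    ASSUMPTIONS: warmup_steps=0 (not stated on the card) and
    total_steps=7365 (last step shown in the training logs).
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of each epoch (steps 2455, 4910, 7365).
for step in (0, 2455, 4910, 7365):
    print(step, linear_lr(step))
```

Under these assumptions the rate falls to roughly two thirds of its starting value after the first epoch, one third after the second, and zero at the last step.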

🏋️ Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 3
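The hyperparameters above map onto keyword arguments of `transformers.TrainingArguments` roughly as follows. This is a sketch, not the author's training script: it assumes the listed batch sizes are per-device values, and the `output_dir` mentioned in the comment is a placeholder.

```python
# Keyword arguments mirroring the hyperparameters listed on this card.
# ASSUMPTION: batch sizes are per device.
training_kwargs = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 128,
    "per_device_eval_batch_size": 128,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3,
}

# With transformers installed, these could be passed on like this
# (output_dir is a hypothetical path):
#   from transformers import TrainingArguments
#   args = TrainingArguments(output_dir="mnli-bert-out", **training_kwargs)
```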

🏋️ Training Logs

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
0.8662	1.0	2455	0.8682	0.6033	0.5946
0.7964	2.0	4910	0.8449	0.6242	0.6242
0.7323	3.0	7365	0.8673	0.6237	0.6231
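The step counts in the log can be cross-checked with a little arithmetic. The sketch below derives the total number of steps and the range of training-set sizes consistent with 2455 steps per epoch at batch size 128. It assumes a single device, no gradient accumulation, and that a final partial batch is kept; none of these is stated on the card.

```python
BATCH_SIZE = 128
STEPS_PER_EPOCH = 2455  # step count at epoch 1.0 in the log
EPOCHS = 3


def example_bounds(steps, batch_size):
    """Smallest and largest dataset size that produces `steps` batches.

    ASSUMPTION: the last batch may be partial (it is not dropped).
    """
    return (steps - 1) * batch_size + 1, steps * batch_size


low, high = example_bounds(STEPS_PER_EPOCH, BATCH_SIZE)
total_steps = STEPS_PER_EPOCH * EPOCHS

print(f"training examples per epoch: between {low} and {high}")
print(f"total optimizer steps: {total_steps}")
```

The total of 7365 steps matches the last row of the log, so the three rows correspond to the end of each epoch.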

Framework versions

Transformers 4.50.3
Pytorch 2.6.0+cu124
Datasets 3.5.0
Tokenizers 0.21.1