End of training

Files changed (15)

README.md ADDED

+---
+library_name: transformers
+license: mit
+base_model: prajjwal1/bert-tiny
+tags:
+- generated_from_trainer
+model-index:
+- name: my_awesome_swag_model
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# my_awesome_swag_model
+This model is a fine-tuned version of [prajjwal1/bert-tiny](https://huggingface.co/prajjwal1/bert-tiny) on an unknown dataset.
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
+- lr_scheduler_type: linear
+- num_epochs: 1
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|
+| No log        | 1.0   | 3    | 1.3831          | 0.25     |
+### Framework versions
+- Transformers 4.47.1
+- Pytorch 2.5.1+cu121
+- Datasets 3.2.0
+- Tokenizers 0.21.0
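The results table in the card reports a validation loss of 1.3831 and an accuracy of 0.25 after 3 steps. The sketch below is a rough sanity check of those numbers, not part of the commit. It assumes a four-way multiple-choice task (the card lists the dataset as unknown, so `NUM_CHOICES` is an assumption) and a linear schedule with no warmup (the card does not mention warmup). The helper names are hypothetical.

```python
import math

def chance_level(num_choices):
    """Accuracy and cross-entropy loss of a uniform guess over num_choices options."""
    return 1.0 / num_choices, math.log(num_choices)

def linear_lr(step, total_steps, base_lr=5e-5):
    """Linear decay from base_lr to 0, assuming zero warmup steps."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)

NUM_CHOICES = 4   # assumption: the card does not state the number of choices
TOTAL_STEPS = 3   # from the "Step" column of the results table

chance_accuracy, chance_loss = chance_level(NUM_CHOICES)
schedule = [linear_lr(step, TOTAL_STEPS) for step in range(TOTAL_STEPS + 1)]

print(f"chance accuracy: {chance_accuracy:.2f}, chance loss: {chance_loss:.4f}")
print("learning rate per step:", schedule)
```

Under these assumptions, the reported loss sits within 0.01 of ln 4, which would be consistent with a model that is still guessing at chance level after three optimizer steps.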

config.json ADDED

+{
+  "_name_or_path": "prajjwal1/bert-tiny",
+  "architectures": [
+    "BertForMultipleChoice"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "classifier_dropout": null,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 128,
+  "initializer_range": 0.02,
+  "intermediate_size": 512,
+  "layer_norm_eps": 1e-12,
+  "max_position_embeddings": 512,
+  "model_type": "bert",
+  "num_attention_heads": 2,
+  "num_hidden_layers": 2,
+  "pad_token_id": 0,
+  "position_embedding_type": "absolute",
+  "torch_dtype": "float32",
+  "transformers_version": "4.47.1",
+  "type_vocab_size": 2,
+  "use_cache": true,
+  "vocab_size": 30522
+}
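The sizes in this config determine the parameter count, which can be compared with the `model.safetensors` size recorded in this commit (17548796 bytes). The following is a back-of-the-envelope sketch, assuming the standard BERT layout (embeddings, encoder layers, pooler) plus a single-output linear classifier for `BertForMultipleChoice`, all stored as float32. The function names are hypothetical.

```python
# Values copied from the config.json in this commit.
config = {
    "hidden_size": 128,
    "intermediate_size": 512,
    "max_position_embeddings": 512,
    "num_hidden_layers": 2,
    "type_vocab_size": 2,
    "vocab_size": 30522,
}

def linear(n_in, n_out):
    """Weights plus bias of a dense layer."""
    return n_in * n_out + n_out

def layer_norm(size):
    """Scale plus shift of a LayerNorm."""
    return 2 * size

def count_params(cfg):
    h, i = cfg["hidden_size"], cfg["intermediate_size"]
    rows = cfg["vocab_size"] + cfg["max_position_embeddings"] + cfg["type_vocab_size"]
    embeddings = rows * h + layer_norm(h)
    attention = 4 * linear(h, h) + layer_norm(h)       # Q, K, V, output projection
    feed_forward = linear(h, i) + linear(i, h) + layer_norm(h)
    encoder = cfg["num_hidden_layers"] * (attention + feed_forward)
    pooler = linear(h, h)
    classifier = linear(h, 1)                          # one score per choice
    return embeddings + encoder + pooler + classifier

total_params = count_params(config)
tensor_bytes = total_params * 4                        # float32 = 4 bytes each
remaining_bytes = 17548796 - tensor_bytes              # file size from the LFS pointer

print(total_params, tensor_bytes, remaining_bytes)
```

The few kilobytes left over would plausibly be the safetensors header (tensor names, shapes, and offsets), though that is an inference from the arithmetic rather than something this diff shows.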

model.safetensors ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:ebd50608af736b607e1f8e9aa71fc1190dbba1eea1e90178d849cb143092e0ee
+size 17548796
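The three lines above are a Git LFS pointer, not the weights themselves: the repository stores the spec version, a SHA-256 object id, and the byte size, while the actual file lives in LFS storage. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper, and stripping the leading `+` is only there to accept text pasted from this diff view.

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.lstrip("+").strip()   # tolerate diff "+" prefixes
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }

pointer = parse_lfs_pointer("""
+version https://git-lfs.github.com/spec/v1
+oid sha256:ebd50608af736b607e1f8e9aa71fc1190dbba1eea1e90178d849cb143092e0ee
+size 17548796
""")

print(pointer["hash_algorithm"], pointer["size"])
```

The same helper applies to the other pointer files in this commit, such as the TensorBoard event files and `training_args.bin`.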

runs/Jan07_12-49-38_97a669587ffa/events.out.tfevents.1736254178.97a669587ffa.1316.0 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:67558fbabfce6d3585de4acc70feba411c91e136d4b08b7210eb5ea25dd2dfd1
+size 5067

runs/Jan07_13-20-17_97a669587ffa/events.out.tfevents.1736256064.97a669587ffa.1316.1 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:4bd75e40956687641479575a77671602f2682a351700bd8bd0cbe54c3430a9f2
+size 5067

runs/Jan07_13-20-17_97a669587ffa/events.out.tfevents.1736256266.97a669587ffa.1316.2 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:7bb74517b2997ca1869438f0f6a17123692c35fdee2c91a6d69c00ae626c7db2
+size 5701

runs/Jan07_13-27-17_97a669587ffa/events.out.tfevents.1736256443.97a669587ffa.1316.3 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:885333427dafcfbe9a7f86de0282a18fa2593446bafcccb8c96487c369274338
+size 5061

runs/Jan07_13-30-48_97a669587ffa/events.out.tfevents.1736256657.97a669587ffa.1316.4 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:1773fac60b14388b627f25bb6e58363a881a3759ff0122047e2639a7aa692aed
+size 5646

runs/Jan07_13-30-48_97a669587ffa/events.out.tfevents.1736256815.97a669587ffa.1316.5 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:857ae92571f288c26fdefdcc264e7e24b8d7ab4f71e1291dfe230bb5b4f9f99f
+size 5688

runs/Jan07_13-33-22_97a669587ffa/events.out.tfevents.1736257224.97a669587ffa.1316.6 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:426d5f3121b1b18c947bf65fb317c8b9396d7fdadc6f10af03a2bb6f4b01113e
+size 5687
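The event file names under `runs/` appear to follow the pattern `events.out.tfevents.<unix time>.<hostname>.<pid>.<counter>`. The sketch below splits a name into those parts and converts the timestamp to UTC. The field meanings are inferred from the names in this commit, and `parse_event_filename` is a hypothetical helper.

```python
from datetime import datetime, timezone

PREFIX = "events.out.tfevents."

def parse_event_filename(name):
    """Split an event file name into timestamp, hostname, pid, and counter."""
    if not name.startswith(PREFIX):
        raise ValueError(f"not an event file name: {name!r}")
    timestamp, rest = name[len(PREFIX):].split(".", 1)
    hostname, pid, counter = rest.rsplit(".", 2)   # hostname may itself contain dots
    return {
        "created_utc": datetime.fromtimestamp(int(timestamp), tz=timezone.utc),
        "hostname": hostname,
        "pid": int(pid),
        "counter": int(counter),
    }

info = parse_event_filename("events.out.tfevents.1736254178.97a669587ffa.1316.0")
print(info["created_utc"].isoformat(), info["hostname"], info["pid"], info["counter"])
```

For the first file, the decoded time matches the `Jan07_12-49-38` directory name. All seven files share the same hostname and process id (1316) with counters 0 through 6, which suggests repeated runs launched from a single session.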

special_tokens_map.json ADDED

+{
+  "cls_token": "[CLS]",
+  "mask_token": "[MASK]",
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "unk_token": "[UNK]"
+}

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "100": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "101": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "102": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "103": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "clean_up_tokenization_spaces": false,
+  "cls_token": "[CLS]",
+  "do_lower_case": true,
+  "extra_special_tokens": {},
+  "mask_token": "[MASK]",
+  "model_max_length": 512,
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "strip_accents": null,
+  "tokenize_chinese_chars": true,
+  "tokenizer_class": "BertTokenizer",
+  "unk_token": "[UNK]"
+}
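The tokenizer config and `special_tokens_map.json` describe the same five special tokens in two places, and `config.json` separately sets `pad_token_id` to 0. The sketch below is a small consistency check over values copied from this diff. Only the `content` field of each `added_tokens_decoder` entry is reproduced here, and the variable names are hypothetical.

```python
# "content" fields copied from added_tokens_decoder in tokenizer_config.json.
added_tokens_decoder = {
    "0": {"content": "[PAD]"},
    "100": {"content": "[UNK]"},
    "101": {"content": "[CLS]"},
    "102": {"content": "[SEP]"},
    "103": {"content": "[MASK]"},
}

# Copied from special_tokens_map.json.
special_tokens_map = {
    "cls_token": "[CLS]",
    "mask_token": "[MASK]",
    "pad_token": "[PAD]",
    "sep_token": "[SEP]",
    "unk_token": "[UNK]",
}

PAD_TOKEN_ID = 0  # "pad_token_id" from config.json

# Map each token id to its string form.
id_to_token = {int(key): entry["content"] for key, entry in added_tokens_decoder.items()}

# Every token named in the map should have an id in the decoder table.
known_tokens = set(id_to_token.values())
missing = sorted(token for token in special_tokens_map.values() if token not in known_tokens)

pad_matches = id_to_token[PAD_TOKEN_ID] == special_tokens_map["pad_token"]

print(id_to_token)
print("missing:", missing, "pad id matches:", pad_matches)
```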

training_args.bin ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:bc9fe57f05c2f4c75cc1e63666e87dd33a4da3551186f251e2833af2f9a5a239
+size 5368

vocab.txt ADDED

The diff for this file is too large to render. See raw diff