	---
	license: apache-2.0
	base_model: allenai/longformer-base-4096
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	model-index:
	- name: relevance-classification-v1
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# relevance-classification-v1

	This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on an unspecified dataset.
	After the final training epoch (50), it achieves the following results on the evaluation set:
	- Loss: 4.5156
	- Accuracy: 0.6069

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 2
	- eval_batch_size: 2
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 50
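
The `linear` scheduler type listed above decays the learning rate from its initial value toward zero over the run. The following is a minimal sketch of that schedule, not the training code itself. It assumes no warmup steps (the card lists none) and takes the total of 16,900 optimizer steps and 338 steps per epoch from the training results table. The function name `linear_lr` is hypothetical.

```python
def linear_lr(step, total_steps=16900, base_lr=2e-5, warmup_steps=0):
    """Learning rate at optimizer step `step` under linear warmup then linear decay.

    Defaults are read off the model card; warmup_steps=0 is an assumption.
    """
    if warmup_steps and step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    steps_per_epoch = 338  # from the Step column of the results table
    for epoch in (0, 10, 25, 50):
        step = epoch * steps_per_epoch
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.2e}")
```

Under these assumptions the learning rate is half its initial value at epoch 25 and reaches zero at the last step.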

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy |
	|:-------------:|:-----:|:-----:|:---------------:|:--------:|
	| No log | 1.0 | 338 | 0.6547 | 0.6138 |
	| 0.744 | 2.0 | 676 | 1.3339 | 0.6069 |
	| 0.8767 | 3.0 | 1014 | 0.6368 | 0.6207 |
	| 0.8767 | 4.0 | 1352 | 0.8089 | 0.5931 |
	| 0.82 | 5.0 | 1690 | 1.7406 | 0.6276 |
	| 0.7448 | 6.0 | 2028 | 1.5868 | 0.6345 |
	| 0.7448 | 7.0 | 2366 | 1.6950 | 0.6483 |
	| 0.5449 | 8.0 | 2704 | 1.8365 | 0.6276 |
	| 0.4678 | 9.0 | 3042 | 1.9301 | 0.6069 |
	| 0.4678 | 10.0 | 3380 | 2.1818 | 0.6138 |
	| 0.3283 | 11.0 | 3718 | 2.1599 | 0.6 |
	| 0.2159 | 12.0 | 4056 | 2.3001 | 0.6207 |
	| 0.2159 | 13.0 | 4394 | 2.3061 | 0.6138 |
	| 0.1953 | 14.0 | 4732 | 2.5816 | 0.6069 |
	| 0.1241 | 15.0 | 5070 | 2.7310 | 0.6069 |
	| 0.1241 | 16.0 | 5408 | 2.5896 | 0.6207 |
	| 0.1793 | 17.0 | 5746 | 2.7177 | 0.6207 |
	| 0.0978 | 18.0 | 6084 | 2.6936 | 0.6069 |
	| 0.0978 | 19.0 | 6422 | 2.4796 | 0.6069 |
	| 0.175 | 20.0 | 6760 | 3.1355 | 0.6 |
	| 0.1408 | 21.0 | 7098 | 3.0787 | 0.6069 |
	| 0.1408 | 22.0 | 7436 | 3.0301 | 0.6 |
	| 0.1127 | 23.0 | 7774 | 3.5055 | 0.5793 |
	| 0.0812 | 24.0 | 8112 | 2.7603 | 0.6414 |
	| 0.0812 | 25.0 | 8450 | 3.2282 | 0.5793 |
	| 0.078 | 26.0 | 8788 | 3.3855 | 0.6138 |
	| 0.0228 | 27.0 | 9126 | 3.2529 | 0.6 |
	| 0.0228 | 28.0 | 9464 | 3.5188 | 0.6 |
	| 0.0556 | 29.0 | 9802 | 3.3436 | 0.5931 |
	| 0.0564 | 30.0 | 10140 | 3.6578 | 0.6069 |
	| 0.0564 | 31.0 | 10478 | 3.6755 | 0.6069 |
	| 0.0339 | 32.0 | 10816 | 3.5301 | 0.6138 |
	| 0.0273 | 33.0 | 11154 | 3.8414 | 0.6069 |
	| 0.0273 | 34.0 | 11492 | 4.0242 | 0.6069 |
	| 0.0045 | 35.0 | 11830 | 4.2730 | 0.5931 |
	| 0.0503 | 36.0 | 12168 | 3.8472 | 0.6069 |
	| 0.0043 | 37.0 | 12506 | 4.1642 | 0.5931 |
	| 0.0043 | 38.0 | 12844 | 4.2903 | 0.5931 |
	| 0.0 | 39.0 | 13182 | 4.3893 | 0.5931 |
	| 0.0 | 40.0 | 13520 | 4.4723 | 0.5931 |
	| 0.0 | 41.0 | 13858 | 4.4564 | 0.5931 |
	| 0.0088 | 42.0 | 14196 | 4.5376 | 0.5931 |
	| 0.0375 | 43.0 | 14534 | 4.2578 | 0.6 |
	| 0.0375 | 44.0 | 14872 | 4.3456 | 0.6069 |
	| 0.0 | 45.0 | 15210 | 4.3547 | 0.6069 |
	| 0.0002 | 46.0 | 15548 | 4.4010 | 0.6138 |
	| 0.0002 | 47.0 | 15886 | 4.4475 | 0.6069 |
	| 0.0 | 48.0 | 16224 | 4.4869 | 0.6069 |
	| 0.0 | 49.0 | 16562 | 4.5066 | 0.6069 |
	| 0.0 | 50.0 | 16900 | 4.5156 | 0.6069 |
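
The table shows training loss falling to 0.0 while validation loss climbs from below 1 to above 4, so the final epoch is not necessarily the best checkpoint. The sketch below copies the Validation Loss and Accuracy columns from the table and reports which epochs score best on each metric. The variable names are illustrative and this is not part of the original training script.

```python
# Validation Loss and Accuracy columns from the results table, epochs 1..50.
val_loss = [
    0.6547, 1.3339, 0.6368, 0.8089, 1.7406, 1.5868, 1.6950, 1.8365, 1.9301, 2.1818,
    2.1599, 2.3001, 2.3061, 2.5816, 2.7310, 2.5896, 2.7177, 2.6936, 2.4796, 3.1355,
    3.0787, 3.0301, 3.5055, 2.7603, 3.2282, 3.3855, 3.2529, 3.5188, 3.3436, 3.6578,
    3.6755, 3.5301, 3.8414, 4.0242, 4.2730, 3.8472, 4.1642, 4.2903, 4.3893, 4.4723,
    4.4564, 4.5376, 4.2578, 4.3456, 4.3547, 4.4010, 4.4475, 4.4869, 4.5066, 4.5156,
]
accuracy = [
    0.6138, 0.6069, 0.6207, 0.5931, 0.6276, 0.6345, 0.6483, 0.6276, 0.6069, 0.6138,
    0.6, 0.6207, 0.6138, 0.6069, 0.6069, 0.6207, 0.6207, 0.6069, 0.6069, 0.6,
    0.6069, 0.6, 0.5793, 0.6414, 0.5793, 0.6138, 0.6, 0.6, 0.5931, 0.6069,
    0.6069, 0.6138, 0.6069, 0.6069, 0.5931, 0.6069, 0.5931, 0.5931, 0.5931, 0.5931,
    0.5931, 0.5931, 0.6, 0.6069, 0.6069, 0.6138, 0.6069, 0.6069, 0.6069, 0.6069,
]

epochs = range(1, len(val_loss) + 1)

# Epoch with the highest evaluation accuracy (first one wins on ties).
best_acc_epoch = max(epochs, key=lambda e: accuracy[e - 1])
# Epoch with the lowest validation loss.
best_loss_epoch = min(epochs, key=lambda e: val_loss[e - 1])

for label, e in (("best accuracy", best_acc_epoch),
                 ("lowest val loss", best_loss_epoch),
                 ("final epoch", len(val_loss))):
    print(f"{label:16s} epoch {e:2d}  "
          f"val_loss {val_loss[e - 1]:.4f}  accuracy {accuracy[e - 1]:.4f}")
```

On these figures, accuracy peaks at epoch 7 (0.6483) and validation loss is lowest at epoch 3 (0.6368), both ahead of the final-epoch values reported at the top of the card. If the run were repeated, the Trainer's `load_best_model_at_end` and `metric_for_best_model` options could be used to keep such a checkpoint rather than the last one.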


	### Framework versions

	- Transformers 4.36.2
	- PyTorch 2.2.2+cu121
	- Datasets 2.16.0
	- Tokenizers 0.15.2
