	---
	license: apache-2.0
	base_model: allenai/longformer-base-4096
	tags:
	- generated_from_trainer
	datasets:
	- essays_su_g
	metrics:
	- accuracy
	model-index:
	- name: longformer-spans
	  results:
	  - task:
	      name: Token Classification
	      type: token-classification
	    dataset:
	      name: essays_su_g
	      type: essays_su_g
	      config: spans
	      split: train[80%:100%]
	      args: spans
	    metrics:
	    - name: Accuracy
	      type: accuracy
	      value: 0.9382309279843586
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# longformer-spans

	This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the `spans` configuration of the essays_su_g dataset.
	It achieves the following results on the evaluation set (`train[80%:100%]`) after the final epoch (epoch 7, step 287):
	- Loss: 0.1841
	- B: {'precision': 0.8358744394618834, 'recall': 0.8935762224352828, 'f1-score': 0.8637627432808155, 'support': 1043.0}
	- I: {'precision': 0.9433073515392811, 'recall': 0.9695677233429395, 'f1-score': 0.9562572833470712, 'support': 17350.0}
	- O: {'precision': 0.9409526006227655, 'recall': 0.8843485800997182, 'f1-score': 0.9117729228362295, 'support': 9226.0}
	- Accuracy: 0.9382
	- Macro avg: {'precision': 0.9067114638746433, 'recall': 0.9158308419593135, 'f1-score': 0.9105976498213719, 'support': 27619.0}
	- Weighted avg: {'precision': 0.9384636765600096, 'recall': 0.9382309279843586, 'f1-score': 0.9379045364930165, 'support': 27619.0}
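The macro and weighted averages can be reproduced from the per-class rows. The sketch below assumes the usual scikit-learn `classification_report` definitions (macro = unweighted mean over classes, weighted = support-weighted mean). The card does not state how the averages were computed, so treat this as a consistency check rather than the original evaluation code; the helper names `macro_avg` and `weighted_avg` are illustrative.

```python
# Per-class metrics copied from the evaluation results listed above.
per_class = {
    "B": {"precision": 0.8358744394618834, "recall": 0.8935762224352828,
          "f1-score": 0.8637627432808155, "support": 1043},
    "I": {"precision": 0.9433073515392811, "recall": 0.9695677233429395,
          "f1-score": 0.9562572833470712, "support": 17350},
    "O": {"precision": 0.9409526006227655, "recall": 0.8843485800997182,
          "f1-score": 0.9117729228362295, "support": 9226},
}


def macro_avg(metric):
    """Unweighted mean of a metric over the three classes."""
    return sum(c[metric] for c in per_class.values()) / len(per_class)


def weighted_avg(metric):
    """Mean of a metric weighted by each class's support (token count)."""
    total = sum(c["support"] for c in per_class.values())
    return sum(c[metric] * c["support"] for c in per_class.values()) / total


for metric in ("precision", "recall", "f1-score"):
    print(f"{metric:10s} macro={macro_avg(metric):.4f} "
          f"weighted={weighted_avg(metric):.4f}")
```

Under these definitions the weighted recall equals the overall accuracy, which is why both are reported as 0.9382.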

	## Model description

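	longformer-spans is a token-classification model that tags each token as `B`, `I`, or `O` (beginning of, inside, or outside a span), following the `spans` configuration of essays_su_g. More information needed.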

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

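	Evaluation used the `train[80%:100%]` split of the `spans` configuration of essays_su_g: 27,619 labeled tokens (1,043 `B`, 17,350 `I`, 9,226 `O`). The training split is not recorded here. More information needed.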

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 7
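To make the `linear` schedule concrete: assuming no warmup steps (none are listed above) and the 287 total optimizer steps shown in the training results, the learning rate would decay from 2e-05 to 0 roughly as sketched below. The helper `linear_lr` is an illustrative name, not part of Transformers.

```python
BASE_LR = 2e-05      # learning_rate from the hyperparameter list
TOTAL_STEPS = 287    # 7 epochs x 41 steps, per the training results table
WARMUP_STEPS = 0     # assumption: the card lists no warmup


def linear_lr(step, base_lr=BASE_LR, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Learning rate after `step` optimizer steps: linear warmup, then linear decay to 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


# Learning rate at the start of training and at the end of each epoch.
for epoch in range(8):
    step = epoch * 41
    print(f"epoch {epoch}: step {step:3d}, lr = {linear_lr(step):.2e}")
```

If the run did use warmup, only `WARMUP_STEPS` needs to change; the decay phase keeps the same form.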

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | B | I | O | Accuracy | Macro avg | Weighted avg |
	|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
	| No log | 1.0 | 41 | 0.2970 | {'precision': 0.8171557562076749, 'recall': 0.34707574304889743, 'f1-score': 0.48721399730820997, 'support': 1043.0} | {'precision': 0.8802934137966912, 'recall': 0.9752737752161383, 'f1-score': 0.9253527288636114, 'support': 17350.0} | {'precision': 0.9304752325873774, 'recall': 0.8021894645566876, 'f1-score': 0.861583236321304, 'support': 9226.0} | 0.8937 | {'precision': 0.8759748008639145, 'recall': 0.7081796609405745, 'f1-score': 0.7580499874977084, 'support': 27619.0} | {'precision': 0.8946720981551954, 'recall': 0.893732575400992, 'f1-score': 0.8875050140583102, 'support': 27619.0} |
	| No log | 2.0 | 82 | 0.2228 | {'precision': 0.7610474631751227, 'recall': 0.8916586768935763, 'f1-score': 0.8211920529801324, 'support': 1043.0} | {'precision': 0.9182955222264335, 'recall': 0.9775216138328531, 'f1-score': 0.946983444540607, 'support': 17350.0} | {'precision': 0.9614026236125126, 'recall': 0.8261435074788641, 'f1-score': 0.8886557071237029, 'support': 9226.0} | 0.9237 | {'precision': 0.8802485363380229, 'recall': 0.898441266068431, 'f1-score': 0.8856104015481474, 'support': 27619.0} | {'precision': 0.9267569578974372, 'recall': 0.9237119374343749, 'f1-score': 0.9227489636830115, 'support': 27619.0} |
	| No log | 3.0 | 123 | 0.1807 | {'precision': 0.845437616387337, 'recall': 0.8705656759348035, 'f1-score': 0.8578176665092113, 'support': 1043.0} | {'precision': 0.9587634878973461, 'recall': 0.9474351585014409, 'f1-score': 0.9530656616901, 'support': 17350.0} | {'precision': 0.9035106382978724, 'recall': 0.9205506178192066, 'f1-score': 0.9119510361859765, 'support': 9226.0} | 0.9356 | {'precision': 0.9025705808608517, 'recall': 0.9128504840851503, 'f1-score': 0.907611454795096, 'support': 27619.0} | {'precision': 0.9360269053132667, 'recall': 0.9355516130200224, 'f1-score': 0.9357345782375959, 'support': 27619.0} |
	| No log | 4.0 | 164 | 0.2177 | {'precision': 0.8223028105167725, 'recall': 0.8696069031639502, 'f1-score': 0.8452935694315005, 'support': 1043.0} | {'precision': 0.9182645433864154, 'recall': 0.9771181556195966, 'f1-score': 0.9467776164414164, 'support': 17350.0} | {'precision': 0.9526943133846536, 'recall': 0.8316713635378279, 'f1-score': 0.8880787037037038, 'support': 9226.0} | 0.9245 | {'precision': 0.8977538890959472, 'recall': 0.8927988074404581, 'f1-score': 0.8933832965255403, 'support': 27619.0} | {'precision': 0.9261417645247878, 'recall': 0.9244722835729027, 'f1-score': 0.9233370852871574, 'support': 27619.0} |
	| No log | 5.0 | 205 | 0.1864 | {'precision': 0.8298059964726632, 'recall': 0.9022051773729626, 'f1-score': 0.8644924207625172, 'support': 1043.0} | {'precision': 0.9426901899089786, 'recall': 0.9670317002881844, 'f1-score': 0.9547058154091271, 'support': 17350.0} | {'precision': 0.9384137216530448, 'recall': 0.8835898547582918, 'f1-score': 0.9101769664489477, 'support': 9226.0} | 0.9367 | {'precision': 0.9036366360115622, 'recall': 0.9176089108064795, 'f1-score': 0.909791734206864, 'support': 27619.0} | {'precision': 0.9369987126692769, 'recall': 0.9367102357073029, 'f1-score': 0.9364243522452534, 'support': 27619.0} |
	| No log | 6.0 | 246 | 0.1768 | {'precision': 0.8413417951042611, 'recall': 0.8897411313518696, 'f1-score': 0.8648648648648648, 'support': 1043.0} | {'precision': 0.9434724091520862, 'recall': 0.9696829971181556, 'f1-score': 0.9563981581490535, 'support': 17350.0} | {'precision': 0.9409258406264395, 'recall': 0.885649252113592, 'f1-score': 0.9124511446119487, 'support': 9226.0} | 0.9386 | {'precision': 0.908580014960929, 'recall': 0.9150244601945391, 'f1-score': 0.9112380558752889, 'support': 27619.0} | {'precision': 0.9387648936131638, 'recall': 0.9385929975741337, 'f1-score': 0.938261209968861, 'support': 27619.0} |
	| No log | 7.0 | 287 | 0.1841 | {'precision': 0.8358744394618834, 'recall': 0.8935762224352828, 'f1-score': 0.8637627432808155, 'support': 1043.0} | {'precision': 0.9433073515392811, 'recall': 0.9695677233429395, 'f1-score': 0.9562572833470712, 'support': 17350.0} | {'precision': 0.9409526006227655, 'recall': 0.8843485800997182, 'f1-score': 0.9117729228362295, 'support': 9226.0} | 0.9382 | {'precision': 0.9067114638746433, 'recall': 0.9158308419593135, 'f1-score': 0.9105976498213719, 'support': 27619.0} | {'precision': 0.9384636765600096, 'recall': 0.9382309279843586, 'f1-score': 0.9379045364930165, 'support': 27619.0} |
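Validation loss is not monotone across epochs, so the final checkpoint is not necessarily the best one. As a small sketch (values copied from the table above; the tuple layout is only for illustration), this picks the epoch with the lowest validation loss and the one with the highest accuracy:

```python
# (epoch, step, validation_loss, accuracy) copied from the training results table.
history = [
    (1, 41, 0.2970, 0.8937),
    (2, 82, 0.2228, 0.9237),
    (3, 123, 0.1807, 0.9356),
    (4, 164, 0.2177, 0.9245),
    (5, 205, 0.1864, 0.9367),
    (6, 246, 0.1768, 0.9386),
    (7, 287, 0.1841, 0.9382),
]

# Select the best row under each criterion.
best_by_loss = min(history, key=lambda row: row[2])
best_by_acc = max(history, key=lambda row: row[3])

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[2]:.4f})")
print(f"highest accuracy:       epoch {best_by_acc[0]} ({best_by_acc[3]:.4f})")
```

Both criteria select epoch 6 (step 246), while the headline numbers in this card come from epoch 7.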


	### Framework versions

	- Transformers 4.37.2
	- PyTorch 2.2.0+cu121
	- Datasets 2.17.0
	- Tokenizers 0.15.2