longformer-simple / meta_data /README_s42_e6.md

Theoreticallyhugo

	---
	base_model: allenai/longformer-base-4096
	tags:
	- generated_from_trainer
	datasets:
	- essays_su_g
	metrics:
	- accuracy
	model-index:
	- name: longformer-simple
	  results:
	  - task:
	      name: Token Classification
	      type: token-classification
	    dataset:
	      name: essays_su_g
	      type: essays_su_g
	      config: simple
	      split: train[80%:100%]
	      args: simple
	    metrics:
	    - name: Accuracy
	      type: accuracy
	      value: 0.8417393823092798
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# longformer-simple

	This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.4340
	- Claim: {'precision': 0.6054216867469879, 'recall': 0.5786948176583493, 'f1-score': 0.591756624141315, 'support': 4168.0}
	- Majorclaim: {'precision': 0.7709074733096085, 'recall': 0.8052973977695167, 'f1-score': 0.7877272727272727, 'support': 2152.0}
	- O: {'precision': 0.9340387212967132, 'recall': 0.8994146975937568, 'f1-score': 0.916399779127554, 'support': 9226.0}
	- Premise: {'precision': 0.8641925937774934, 'recall': 0.8949722521328585, 'f1-score': 0.8793131510416666, 'support': 12073.0}
	- Accuracy: 0.8417
	- Macro avg: {'precision': 0.7936401187827008, 'recall': 0.7945947912886202, 'f1-score': 0.793799206759452, 'support': 27619.0}
	- Weighted avg: {'precision': 0.8412045657077691, 'recall': 0.8417393823092798, 'f1-score': 0.8411703079433343, 'support': 27619.0}
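The macro and weighted averages can be reproduced from the four per-class entries above. The following is a minimal sketch of that arithmetic; the helper names `macro_average` and `weighted_average` are illustrative and not part of the original training code.

```python
# Per-class metrics copied from the evaluation results above.
per_class = {
    "Claim": {"precision": 0.6054216867469879, "recall": 0.5786948176583493,
              "f1-score": 0.591756624141315, "support": 4168.0},
    "Majorclaim": {"precision": 0.7709074733096085, "recall": 0.8052973977695167,
                   "f1-score": 0.7877272727272727, "support": 2152.0},
    "O": {"precision": 0.9340387212967132, "recall": 0.8994146975937568,
          "f1-score": 0.916399779127554, "support": 9226.0},
    "Premise": {"precision": 0.8641925937774934, "recall": 0.8949722521328585,
                "f1-score": 0.8793131510416666, "support": 12073.0},
}


def macro_average(report, key):
    """Unweighted mean of a metric over all classes."""
    return sum(metrics[key] for metrics in report.values()) / len(report)


def weighted_average(report, key):
    """Mean of a metric over all classes, weighted by class support."""
    total = sum(metrics["support"] for metrics in report.values())
    return sum(metrics[key] * metrics["support"] for metrics in report.values()) / total


total_support = sum(metrics["support"] for metrics in per_class.values())
macro_f1 = macro_average(per_class, "f1-score")
weighted_recall = weighted_average(per_class, "recall")

print(f"total support:   {total_support:.0f}")
print(f"macro F1:        {macro_f1:.4f}")
print(f"weighted recall: {weighted_recall:.4f}")
```

The support-weighted recall equals the overall accuracy, because each class's recall times its support is the number of correctly labeled items in that class. Claim has the lowest F1 of the four labels, which is why the macro F1 sits well below the weighted F1.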

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

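	The evaluation results above were computed on the `train[80%:100%]` slice of the `simple` configuration of essays_su_g, as recorded in the card metadata. More information is needed on the training split and on the dataset itself.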

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 6
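The list above can be expressed as keyword arguments for `transformers.TrainingArguments`. This is a sketch under assumptions, not the original training script: it assumes the reported batch sizes are per-device values on a single device, and `build_training_arguments` and its `output_dir` argument are hypothetical names introduced here.

```python
# Hyperparameters from the list above, keyed by TrainingArguments parameter names.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,  # assumption: single device, so 8 per device
    "per_device_eval_batch_size": 8,   # assumption: single device, so 8 per device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 6,
}


def build_training_arguments(output_dir):
    """Build TrainingArguments from the card's hyperparameters.

    `output_dir` is a hypothetical path chosen by the caller.
    The import is done lazily so the dictionary can be inspected
    without transformers installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

Settings the card does not list, such as the evaluation and checkpointing schedule, are left at their defaults here and would need to be set to match the actual run.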

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Claim | Majorclaim | O | Premise | Accuracy | Macro avg | Weighted avg |
	|:-------------:|:-----:|:----:|:---------------:|:-----:|:----------:|:-:|:-------:|:--------:|:---------:|:------------:|
	| No log | 1.0 | 41 | 0.6062 | {'precision': 0.46017699115044247, 'recall': 0.21209213051823417, 'f1-score': 0.2903596649696173, 'support': 4168.0} | {'precision': 0.6158224245873648, 'recall': 0.5027881040892194, 'f1-score': 0.5535942696341775, 'support': 2152.0} | {'precision': 0.8984457169568774, 'recall': 0.8332972035551701, 'f1-score': 0.8646460102344935, 'support': 9226.0} | {'precision': 0.751105044201768, 'recall': 0.9570943427482813, 'f1-score': 0.841679717376261, 'support': 12073.0} | 0.7679 | {'precision': 0.6813875442241132, 'recall': 0.6263179452277262, 'f1-score': 0.6375699155536373, 'support': 27619.0} | {'precision': 0.745878523484527, 'recall': 0.7679133929541258, 'f1-score': 0.7437045972031264, 'support': 27619.0} |
	| No log | 2.0 | 82 | 0.4588 | {'precision': 0.5838409746713691, 'recall': 0.43690019193857965, 'f1-score': 0.49979415397282845, 'support': 4168.0} | {'precision': 0.6924335378323109, 'recall': 0.7867100371747212, 'f1-score': 0.736567326517294, 'support': 2152.0} | {'precision': 0.9328012953967152, 'recall': 0.8741599826577064, 'f1-score': 0.9025290957923008, 'support': 9226.0} | {'precision': 0.8268327242896562, 'recall': 0.9183301582042575, 'f1-score': 0.8701828741856997, 'support': 12073.0} | 0.8207 | {'precision': 0.7589771330475128, 'recall': 0.7540250924938162, 'f1-score': 0.7522683626170307, 'support': 27619.0} | {'precision': 0.8150889745292919, 'recall': 0.8206669321843658, 'f1-score': 0.8146814221459026, 'support': 27619.0} |
	| No log | 3.0 | 123 | 0.4322 | {'precision': 0.5977704127749323, 'recall': 0.4760076775431862, 'f1-score': 0.5299853078669694, 'support': 4168.0} | {'precision': 0.7029702970297029, 'recall': 0.824814126394052, 'f1-score': 0.7590335685268335, 'support': 2152.0} | {'precision': 0.9453125, 'recall': 0.8787123347062649, 'f1-score': 0.9107965397146388, 'support': 9226.0} | {'precision': 0.8376392150920524, 'recall': 0.9157624451254867, 'f1-score': 0.8749604305159862, 'support': 12073.0} | 0.8299 | {'precision': 0.7709231062241719, 'recall': 0.7738241459422475, 'f1-score': 0.7686939616561069, 'support': 27619.0} | {'precision': 0.826915186229052, 'recall': 0.8299359136826098, 'f1-score': 0.8258381967372473, 'support': 27619.0} |
	| No log | 4.0 | 164 | 0.4234 | {'precision': 0.6074243579964403, 'recall': 0.5731765834932822, 'f1-score': 0.5898037279348228, 'support': 4168.0} | {'precision': 0.8064516129032258, 'recall': 0.7202602230483272, 'f1-score': 0.7609229258713793, 'support': 2152.0} | {'precision': 0.897263864136702, 'recall': 0.9277043138955127, 'f1-score': 0.9122302158273382, 'support': 9226.0} | {'precision': 0.8721472392638037, 'recall': 0.8831276401888511, 'f1-score': 0.8776030949049305, 'support': 12073.0} | 0.8386 | {'precision': 0.7958217685750428, 'recall': 0.7760671901564933, 'f1-score': 0.7851399911346177, 'support': 27619.0} | {'precision': 0.8354690113781823, 'recall': 0.8385531699192584, 'f1-score': 0.8366467363234656, 'support': 27619.0} |
	| No log | 5.0 | 205 | 0.4306 | {'precision': 0.6152236463510332, 'recall': 0.564299424184261, 'f1-score': 0.5886622450256539, 'support': 4168.0} | {'precision': 0.7490330898152128, 'recall': 0.8099442379182156, 'f1-score': 0.7782987273945077, 'support': 2152.0} | {'precision': 0.9314760727926309, 'recall': 0.8987643615868198, 'f1-score': 0.9148278905560459, 'support': 9226.0} | {'precision': 0.863292750855415, 'recall': 0.89861674811563, 'f1-score': 0.8806006493506493, 'support': 12073.0} | 0.8413 | {'precision': 0.789756389953573, 'recall': 0.7929061929512317, 'f1-score': 0.7905973780817142, 'support': 27619.0} | {'precision': 0.8397300045597481, 'recall': 0.8413048988015497, 'f1-score': 0.8400064034360539, 'support': 27619.0} |
	| No log | 6.0 | 246 | 0.4340 | {'precision': 0.6054216867469879, 'recall': 0.5786948176583493, 'f1-score': 0.591756624141315, 'support': 4168.0} | {'precision': 0.7709074733096085, 'recall': 0.8052973977695167, 'f1-score': 0.7877272727272727, 'support': 2152.0} | {'precision': 0.9340387212967132, 'recall': 0.8994146975937568, 'f1-score': 0.916399779127554, 'support': 9226.0} | {'precision': 0.8641925937774934, 'recall': 0.8949722521328585, 'f1-score': 0.8793131510416666, 'support': 12073.0} | 0.8417 | {'precision': 0.7936401187827008, 'recall': 0.7945947912886202, 'f1-score': 0.793799206759452, 'support': 27619.0} | {'precision': 0.8412045657077691, 'recall': 0.8417393823092798, 'f1-score': 0.8411703079433343, 'support': 27619.0} |
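The per-epoch rows are easier to compare once the key numbers are pulled out. The sketch below copies the step, validation loss, accuracy and macro F1 from each row and picks the best epoch under two illustrative criteria. The card does not say which criterion, if any, was used to select a checkpoint; the reported evaluation results match the epoch 6 row.

```python
# Step, validation loss, accuracy and macro-average F1 copied from the table rows above.
history = [
    {"epoch": 1, "step": 41, "val_loss": 0.6062, "accuracy": 0.7679, "macro_f1": 0.6375699155536373},
    {"epoch": 2, "step": 82, "val_loss": 0.4588, "accuracy": 0.8207, "macro_f1": 0.7522683626170307},
    {"epoch": 3, "step": 123, "val_loss": 0.4322, "accuracy": 0.8299, "macro_f1": 0.7686939616561069},
    {"epoch": 4, "step": 164, "val_loss": 0.4234, "accuracy": 0.8386, "macro_f1": 0.7851399911346177},
    {"epoch": 5, "step": 205, "val_loss": 0.4306, "accuracy": 0.8413, "macro_f1": 0.7905973780817142},
    {"epoch": 6, "step": 246, "val_loss": 0.4340, "accuracy": 0.8417, "macro_f1": 0.793799206759452},
]

# Two possible selection criteria (illustrative only).
best_by_loss = min(history, key=lambda row: row["val_loss"])
best_by_f1 = max(history, key=lambda row: row["macro_f1"])

# Optimizer steps per epoch, taken from the first row.
steps_per_epoch = history[0]["step"]

print(f"lowest validation loss: epoch {best_by_loss['epoch']} ({best_by_loss['val_loss']})")
print(f"highest macro F1:       epoch {best_by_f1['epoch']} ({best_by_f1['macro_f1']:.4f})")
print(f"steps per epoch:        {steps_per_epoch}")
```

Validation loss is lowest at epoch 4 and rises slightly afterwards, while accuracy and macro F1 keep improving through epoch 6, so the two criteria pick different epochs.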


	### Framework versions

	- Transformers 4.37.2
	- Pytorch 2.2.0+cu121
	- Datasets 2.17.0
	- Tokenizers 0.15.2
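
To check whether a local environment matches the versions listed above, something like the following can be used. It is a sketch that relies only on the standard library; it assumes "Pytorch" corresponds to the `torch` distribution name, and `compare_versions` is a hypothetical helper, not part of any of the listed libraries.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from the list above, keyed by distribution name.
# Assumption: "Pytorch" is installed under the distribution name "torch".
EXPECTED = {
    "transformers": "4.37.2",
    "torch": "2.2.0+cu121",
    "datasets": "2.17.0",
    "tokenizers": "0.15.2",
}


def compare_versions(expected):
    """Return {package: (wanted, installed, matches)} for each expected package.

    `installed` is None when the package is not installed.
    """
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions(EXPECTED).items():
    status = "ok" if matches else "differs"
    print(f"{package}: wanted {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the comparison only reports exact string equality with the versions recorded in this card.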