--- license: apache-2.0 base_model: allenai/longformer-base-4096 tags: - generated_from_trainer datasets: - essays_su_g metrics: - accuracy model-index: - name: longformer-spans results: - task: name: Token Classification type: token-classification dataset: name: essays_su_g type: essays_su_g config: spans split: train[80%:100%] args: spans metrics: - name: Accuracy type: accuracy value: 0.939172308917774 --- # longformer-spans This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset. It achieves the following results on the evaluation set: - Loss: 0.2166 - B: {'precision': 0.8636788048552755, 'recall': 0.8868648130393096, 'f1-score': 0.8751182592242194, 'support': 1043.0} - I: {'precision': 0.948943661971831, 'recall': 0.9630547550432277, 'f1-score': 0.9559471365638768, 'support': 17350.0} - O: {'precision': 0.9289709172259508, 'recall': 0.9001734229351832, 'f1-score': 0.9143454805680943, 'support': 9226.0} - Accuracy: 0.9392 - Macro avg: {'precision': 0.9138644613510191, 'recall': 0.9166976636725735, 'f1-score': 0.9151369587853968, 'support': 27619.0} - Weighted avg: {'precision': 0.9390519284189125, 'recall': 0.939172308917774, 'f1-score': 0.9389978843359775, 'support': 27619.0} ## Model description More information needed ## Intended uses & limitations More information needed ## Training and evaluation data More information needed ## Training procedure ### Training hyperparameters The following hyperparameters were used during training: - learning_rate: 2e-05 - train_batch_size: 8 - eval_batch_size: 8 - seed: 42 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 - lr_scheduler_type: linear - num_epochs: 10 ### Training results | Training Loss | Epoch | Step | Validation Loss | B | I | O | Accuracy | Macro avg | Weighted avg | |:-------------:|:-----:|:----:|:---------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:| | No log | 1.0 | 41 | 0.2992 | {'precision': 0.7993138936535163, 'recall': 0.4467881112176414, 'f1-score': 0.5731857318573186, 'support': 1043.0} | {'precision': 0.8829387840233601, 'recall': 0.9759654178674352, 'f1-score': 0.9271243977222952, 'support': 17350.0} | {'precision': 0.9361160600661746, 'recall': 0.7973119445046607, 'f1-score': 0.8611566377897449, 'support': 9226.0} | 0.8963 | {'precision': 0.8727895792476836, 'recall': 0.7400218245299124, 'f1-score': 0.7871555891231196, 'support': 27619.0} | {'precision': 0.8975444101544748, 'recall': 0.8963032694883957, 'f1-score': 0.8917220811418657, 'support': 27619.0} | | No log | 2.0 | 82 | 0.2008 | {'precision': 0.7899231426131511, 'recall': 0.8868648130393096, 'f1-score': 0.8355916892502258, 'support': 1043.0} | {'precision': 0.9329899761865205, 'recall': 0.9710086455331413, 'f1-score': 0.951619736210354, 'support': 17350.0} | {'precision': 0.9478012155881301, 'recall': 0.862020377194884, 'f1-score': 0.9028779020264517, 'support': 9226.0} | 0.9314 | {'precision': 0.8902381114626006, 'recall': 0.9066312785891116, 'f1-score': 0.8966964424956773, 'support': 27619.0} | {'precision': 0.9325348470110335, 'recall': 0.9314240196965857, 'f1-score': 0.9309560838275704, 'support': 27619.0} | | No log | 3.0 | 123 | 0.1754 | {'precision': 0.8574144486692015, 'recall': 0.8648130393096836, 'f1-score': 0.8610978520286395, 'support': 1043.0} | {'precision': 0.9637472869126532, 'recall': 0.9469164265129683, 'f1-score': 0.9552577259644737, 'support': 17350.0} | {'precision': 0.9027310924369748, 'recall': 0.9314979406026447, 'f1-score': 0.916888936306412, 'support': 9226.0} | 0.9387 | {'precision': 0.9079642760062764, 'recall': 0.9144091354750987, 'f1-score': 0.9110815047665084, 'support': 27619.0} | {'precision': 0.9393495693805003, 'recall': 0.9386654114920888, 'f1-score': 0.9388849680116025, 'support': 27619.0} | | No log | 4.0 | 164 | 0.1737 | {'precision': 0.8583732057416268, 'recall': 0.8600191754554171, 'f1-score': 0.8591954022988505, 'support': 1043.0} | {'precision': 0.9443852068017284, 'recall': 0.9699135446685879, 'f1-score': 0.9569791577810003, 'support': 17350.0} | {'precision': 0.9394631639063392, 'recall': 0.8915022761760243, 'f1-score': 0.9148545687114177, 'support': 9226.0} | 0.9396 | {'precision': 0.9140738588165648, 'recall': 0.9071449987666765, 'f1-score': 0.9103430429304228, 'support': 27619.0} | {'precision': 0.9394928759838658, 'recall': 0.9395705854665267, 'f1-score': 0.939214940549245, 'support': 27619.0} | | No log | 5.0 | 205 | 0.2081 | {'precision': 0.8590225563909775, 'recall': 0.8763183125599233, 'f1-score': 0.8675842429995253, 'support': 1043.0} | {'precision': 0.9336273428886439, 'recall': 0.9761383285302594, 'f1-score': 0.9544096928712315, 'support': 17350.0} | {'precision': 0.9511586452762923, 'recall': 0.8675482332538478, 'f1-score': 0.9074315514993481, 'support': 9226.0} | 0.9361 | {'precision': 0.9146028481853046, 'recall': 0.9066682914480101, 'f1-score': 0.9098084957900351, 'support': 27619.0} | {'precision': 0.9366662292897221, 'recall': 0.9360947174046852, 'f1-score': 0.9354379967014502, 'support': 27619.0} | | No log | 6.0 | 246 | 0.1913 | {'precision': 0.8325991189427313, 'recall': 0.9060402684563759, 'f1-score': 0.8677685950413224, 'support': 1043.0} | {'precision': 0.935986255057363, 'recall': 0.973371757925072, 'f1-score': 0.9543129997457125, 'support': 17350.0} | {'precision': 0.9491766378391185, 'recall': 0.8684153479297637, 'f1-score': 0.9070017546838739, 'support': 9226.0} | 0.9358 | {'precision': 0.9059206706130709, 'recall': 0.9159424581037373, 'f1-score': 0.9096944498236362, 'support': 27619.0} | {'precision': 0.9364881446470265, 'recall': 0.9357688547738875, 'f1-score': 0.9352406451692542, 'support': 27619.0} | | No log | 7.0 | 287 | 0.1970 | {'precision': 0.8484848484848485, 'recall': 0.8859060402684564, 'f1-score': 0.8667917448405252, 'support': 1043.0} | {'precision': 0.9405801971326165, 'recall': 0.9680115273775216, 'f1-score': 0.9540987331704823, 'support': 17350.0} | {'precision': 0.9371685496887249, 'recall': 0.8810969000650336, 'f1-score': 0.9082681564245809, 'support': 9226.0} | 0.9359 | {'precision': 0.9087445317687299, 'recall': 0.9116714892370039, 'f1-score': 0.9097195448118628, 'support': 27619.0} | {'precision': 0.9359626762970696, 'recall': 0.9358774756508201, 'f1-score': 0.9354921909391984, 'support': 27619.0} | | No log | 8.0 | 328 | 0.2042 | {'precision': 0.8507734303912647, 'recall': 0.8964525407478428, 'f1-score': 0.8730158730158729, 'support': 1043.0} | {'precision': 0.9413907099232364, 'recall': 0.96835734870317, 'f1-score': 0.9546836378100406, 'support': 17350.0} | {'precision': 0.9384296091317883, 'recall': 0.8821807934099285, 'f1-score': 0.9094362813565005, 'support': 9226.0} | 0.9369 | {'precision': 0.9101979164820965, 'recall': 0.9156635609536471, 'f1-score': 0.9123785973941381, 'support': 27619.0} | {'precision': 0.9369795097185314, 'recall': 0.936855063543213, 'f1-score': 0.9364848764747034, 'support': 27619.0} | | No log | 9.0 | 369 | 0.2107 | {'precision': 0.8516483516483516, 'recall': 0.8916586768935763, 'f1-score': 0.8711943793911008, 'support': 1043.0} | {'precision': 0.9517556380245504, 'recall': 0.960806916426513, 'f1-score': 0.9562598594579091, 'support': 17350.0} | {'precision': 0.9260985352862849, 'recall': 0.9046173856492521, 'f1-score': 0.9152319333260226, 'support': 9226.0} | 0.9394 | {'precision': 0.9098341749863957, 'recall': 0.9190276596564471, 'f1-score': 0.9142287240583441, 'support': 27619.0} | {'precision': 0.9394045634181702, 'recall': 0.9394257576306166, 'f1-score': 0.9393422685892149, 'support': 27619.0} | | No log | 10.0 | 410 | 0.2166 | {'precision': 0.8636788048552755, 'recall': 0.8868648130393096, 'f1-score': 0.8751182592242194, 'support': 1043.0} | {'precision': 0.948943661971831, 'recall': 0.9630547550432277, 'f1-score': 0.9559471365638768, 'support': 17350.0} | {'precision': 0.9289709172259508, 'recall': 0.9001734229351832, 'f1-score': 0.9143454805680943, 'support': 9226.0} | 0.9392 | {'precision': 0.9138644613510191, 'recall': 0.9166976636725735, 'f1-score': 0.9151369587853968, 'support': 27619.0} | {'precision': 0.9390519284189125, 'recall': 0.939172308917774, 'f1-score': 0.9389978843359775, 'support': 27619.0} | ### Framework versions - Transformers 4.37.2 - Pytorch 2.2.0+cu121 - Datasets 2.17.0 - Tokenizers 0.15.2