satyanshu404
committed on
Commit • 49dd1fb
1 Parent(s): dfdc38c
End of training
- README.md +103 -53
- model.safetensors +1 -1
- training_args.bin +1 -1
README.md
CHANGED
@@ -17,8 +17,8 @@ should probably proofread and complete it, then remove this comment. -->

 This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss:
-- Accuracy: 0.
+- Loss: 5.2284
+- Accuracy: 0.6552

 ## Model description

@@ -43,62 +43,112 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs:
+- num_epochs: 100

 ### Training results

 | Training Loss | Epoch | Step | Validation Loss | Accuracy |
 |:-------------:|:-----:|:-----:|:---------------:|:--------:|
-| No log | 1.0 | 338 | 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.0 | 37.0 | 12506 |
-| 0.0 | 38.0 | 12844 | 4.
-| 0.0 | 39.0 | 13182 | 4.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
+| No log | 1.0 | 338 | 0.6800 | 0.5448 |
+| 0.7174 | 2.0 | 676 | 1.4472 | 0.6276 |
+| 0.865 | 3.0 | 1014 | 1.2742 | 0.6621 |
+| 0.865 | 4.0 | 1352 | 1.4262 | 0.6621 |
+| 0.5753 | 5.0 | 1690 | 2.1018 | 0.6414 |
+| 0.335 | 6.0 | 2028 | 2.4029 | 0.6345 |
+| 0.335 | 7.0 | 2366 | 1.9533 | 0.6483 |
+| 0.2503 | 8.0 | 2704 | 2.4815 | 0.6138 |
+| 0.1785 | 9.0 | 3042 | 2.5177 | 0.6897 |
+| 0.1785 | 10.0 | 3380 | 2.5533 | 0.6552 |
+| 0.1067 | 11.0 | 3718 | 2.9023 | 0.6552 |
+| 0.0957 | 12.0 | 4056 | 3.2890 | 0.6345 |
+| 0.0957 | 13.0 | 4394 | 3.5851 | 0.6138 |
+| 0.0166 | 14.0 | 4732 | 3.6766 | 0.5931 |
+| 0.1395 | 15.0 | 5070 | 3.6210 | 0.6069 |
+| 0.1395 | 16.0 | 5408 | 3.2261 | 0.6414 |
+| 0.1005 | 17.0 | 5746 | 3.2913 | 0.6414 |
+| 0.0793 | 18.0 | 6084 | 3.6091 | 0.6207 |
+| 0.0793 | 19.0 | 6422 | 2.4907 | 0.6897 |
+| 0.13 | 20.0 | 6760 | 3.0017 | 0.6552 |
+| 0.0467 | 21.0 | 7098 | 3.1797 | 0.6759 |
+| 0.0467 | 22.0 | 7436 | 3.4537 | 0.6414 |
+| 0.0875 | 23.0 | 7774 | 3.1266 | 0.6414 |
+| 0.0677 | 24.0 | 8112 | 3.4799 | 0.6759 |
+| 0.0677 | 25.0 | 8450 | 3.3836 | 0.6690 |
+| 0.0892 | 26.0 | 8788 | 3.1044 | 0.6483 |
+| 0.1089 | 27.0 | 9126 | 3.5136 | 0.6552 |
+| 0.1089 | 28.0 | 9464 | 3.3848 | 0.6483 |
+| 0.0586 | 29.0 | 9802 | 3.5435 | 0.6621 |
+| 0.043 | 30.0 | 10140 | 3.6754 | 0.6414 |
+| 0.043 | 31.0 | 10478 | 3.8983 | 0.6483 |
+| 0.0026 | 32.0 | 10816 | 3.8528 | 0.6414 |
+| 0.0195 | 33.0 | 11154 | 3.9876 | 0.6483 |
+| 0.0195 | 34.0 | 11492 | 2.9999 | 0.6414 |
+| 0.0781 | 35.0 | 11830 | 3.7963 | 0.6207 |
+| 0.0552 | 36.0 | 12168 | 4.2694 | 0.6138 |
+| 0.0 | 37.0 | 12506 | 4.3729 | 0.6138 |
+| 0.0 | 38.0 | 12844 | 4.4702 | 0.6138 |
+| 0.0 | 39.0 | 13182 | 4.5190 | 0.6138 |
+| 0.0125 | 40.0 | 13520 | 4.2951 | 0.6483 |
+| 0.0125 | 41.0 | 13858 | 3.9059 | 0.6276 |
+| 0.0709 | 42.0 | 14196 | 3.4919 | 0.6621 |
+| 0.0362 | 43.0 | 14534 | 4.0863 | 0.6276 |
+| 0.0362 | 44.0 | 14872 | 3.9934 | 0.6276 |
+| 0.0311 | 45.0 | 15210 | 4.3174 | 0.6207 |
+| 0.0163 | 46.0 | 15548 | 4.3117 | 0.6138 |
+| 0.0163 | 47.0 | 15886 | 4.2067 | 0.6414 |
+| 0.0235 | 48.0 | 16224 | 3.2403 | 0.6483 |
+| 0.0512 | 49.0 | 16562 | 3.6099 | 0.6621 |
+| 0.0512 | 50.0 | 16900 | 3.9438 | 0.6345 |
+| 0.0002 | 51.0 | 17238 | 4.0551 | 0.6345 |
+| 0.0 | 52.0 | 17576 | 4.1505 | 0.6345 |
+| 0.0 | 53.0 | 17914 | 4.2107 | 0.6345 |
+| 0.0 | 54.0 | 18252 | 4.1841 | 0.5931 |
+| 0.0493 | 55.0 | 18590 | 4.4524 | 0.6207 |
+| 0.0493 | 56.0 | 18928 | 4.3673 | 0.6276 |
+| 0.0172 | 57.0 | 19266 | 4.4991 | 0.6345 |
+| 0.0002 | 58.0 | 19604 | 4.7284 | 0.6138 |
+| 0.0002 | 59.0 | 19942 | 4.7207 | 0.6276 |
+| 0.0004 | 60.0 | 20280 | 4.8372 | 0.6276 |
+| 0.0132 | 61.0 | 20618 | 5.0463 | 0.6138 |
+| 0.0132 | 62.0 | 20956 | 4.0695 | 0.6483 |
+| 0.0294 | 63.0 | 21294 | 4.4791 | 0.6276 |
+| 0.0234 | 64.0 | 21632 | 4.0409 | 0.6759 |
+| 0.0234 | 65.0 | 21970 | 4.3323 | 0.6276 |
+| 0.0311 | 66.0 | 22308 | 4.5133 | 0.6345 |
+| 0.0069 | 67.0 | 22646 | 4.1708 | 0.6690 |
+| 0.0069 | 68.0 | 22984 | 4.7436 | 0.6276 |
+| 0.0001 | 69.0 | 23322 | 4.8199 | 0.6276 |
+| 0.0011 | 70.0 | 23660 | 5.2157 | 0.5862 |
+| 0.0011 | 71.0 | 23998 | 5.0111 | 0.6069 |
+| 0.0279 | 72.0 | 24336 | 4.7120 | 0.6621 |
+| 0.0 | 73.0 | 24674 | 4.8631 | 0.6207 |
+| 0.0117 | 74.0 | 25012 | 4.9149 | 0.6276 |
+| 0.0117 | 75.0 | 25350 | 4.9518 | 0.6276 |
+| 0.0 | 76.0 | 25688 | 4.9781 | 0.6276 |
+| 0.0 | 77.0 | 26026 | 5.0057 | 0.6345 |
+| 0.0 | 78.0 | 26364 | 5.0409 | 0.6345 |
+| 0.0 | 79.0 | 26702 | 5.0909 | 0.6345 |
+| 0.0119 | 80.0 | 27040 | 4.4556 | 0.6552 |
+| 0.0119 | 81.0 | 27378 | 4.5697 | 0.6621 |
+| 0.0 | 82.0 | 27716 | 4.8371 | 0.6483 |
+| 0.0 | 83.0 | 28054 | 4.8793 | 0.6483 |
+| 0.0 | 84.0 | 28392 | 4.9278 | 0.6414 |
+| 0.0 | 85.0 | 28730 | 4.9605 | 0.6414 |
+| 0.0 | 86.0 | 29068 | 5.2864 | 0.6207 |
+| 0.0 | 87.0 | 29406 | 5.3216 | 0.6207 |
+| 0.0 | 88.0 | 29744 | 5.3452 | 0.6207 |
+| 0.0 | 89.0 | 30082 | 5.5673 | 0.6069 |
+| 0.0 | 90.0 | 30420 | 5.3842 | 0.6276 |
+| 0.0 | 91.0 | 30758 | 5.3997 | 0.6276 |
+| 0.0 | 92.0 | 31096 | 5.4139 | 0.6276 |
+| 0.0 | 93.0 | 31434 | 5.4287 | 0.6276 |
+| 0.0 | 94.0 | 31772 | 5.4433 | 0.6345 |
+| 0.0 | 95.0 | 32110 | 5.1979 | 0.6552 |
+| 0.0 | 96.0 | 32448 | 5.2034 | 0.6552 |
+| 0.0001 | 97.0 | 32786 | 5.2129 | 0.6552 |
+| 0.0 | 98.0 | 33124 | 5.2220 | 0.6552 |
+| 0.0 | 99.0 | 33462 | 5.2267 | 0.6552 |
+| 0.0 | 100.0 | 33800 | 5.2284 | 0.6552 |


 ### Framework versions
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:dd90dbe1f78a8b6bcf3743cf89aaf3b17e6415ba2056a17b7b63aec48788a37e
 size 594678184
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:7861e43acd2871518a82c037f63486e2abcb05928d4e97a5d212bb27f5c329b0
 size 4792