Update model.py
#1
by
KoichiYasuoka
- opened
IMHO the dense layer is not necessary for token classification, as described in https://qiita.com/KoichiYasuoka/items/751c02216a65d105d3d2
Thanks, I will test it today :)
I ran some experiments, and I am seeing a performance degradation with this solution:
| Configuration | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Avg. |
|---|---|---|---|---|---|---|
| bs=16,e=10,lr=1e-05 | 95.71 | 95.42 | 95.53 | 95.56 | 95.43 | 95.53 |
| bs=16,e=10,lr=1e-05 - No Dense | 95.24 | 95.23 | 95.01 | 94.98 | 95.18 | 95.12 |
Thank you for trying it, and now I understand that the dense layer makes it better...
KoichiYasuoka changed pull request status to closed