devngho committed
Commit 932a0e1
Parent: 48f010b

Model save

README.md CHANGED
@@ -1,5 +1,5 @@
  ---
- license: mit
+ license: apache-2.0
  base_model: lemon-mint/LaBSE-EnKo-Nano-Preview-v0.3
  tags:
  - generated_from_trainer
@@ -15,11 +15,15 @@ model-index:
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

- It's a training checkpoint. I strongly recommend not to use this model 🤗
-
  # ko-edu-classifier

  This model is a fine-tuned version of [lemon-mint/LaBSE-EnKo-Nano-Preview-v0.3](https://huggingface.co/lemon-mint/LaBSE-EnKo-Nano-Preview-v0.3) on the None dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.3773
+ - Precision: 0.2713
+ - Recall: 0.2016
+ - F1 Macro: 0.1949
+ - Accuracy: 0.5907

  ## Model description

@@ -38,49 +42,230 @@ More information needed
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - learning_rate: 0.0003
- - train_batch_size: 256
- - eval_batch_size: 256
+ - learning_rate: 0.0005
+ - train_batch_size: 64
+ - eval_batch_size: 64
  - seed: 0
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_ratio: 0.2
- - num_epochs: 30
+ - num_epochs: 20

  ### Training results

- | Training Loss | Epoch   | Step | Validation Loss | Precision | Recall | F1 Macro | Accuracy |
- |:-------------:|:-------:|:----:|:---------------:|:---------:|:------:|:--------:|:--------:|
- | 8.2847        | 0.9922  | 128  | 5.7695          | 0.0554    | 0.1667 | 0.0832   | 0.3326   |
- | 2.9466        | 1.9845  | 256  | 2.4992          | 0.0297    | 0.1667 | 0.0504   | 0.1783   |
- | 2.2442        | 2.9767  | 384  | 2.2770          | 0.0972    | 0.1779 | 0.0789   | 0.1884   |
- | 2.11          | 3.9690  | 512  | 2.2539          | 0.1370    | 0.1917 | 0.1233   | 0.1966   |
- | 2.0444        | 4.9612  | 640  | 1.9768          | 0.2723    | 0.2069 | 0.1448   | 0.2171   |
- | 2.0458        | 5.9535  | 768  | 2.1823          | 0.1460    | 0.2022 | 0.1450   | 0.2021   |
- | 2.0249        | 6.9457  | 896  | 2.0237          | 0.2773    | 0.2019 | 0.1478   | 0.2062   |
- | 2.0141        | 7.9380  | 1024 | 2.0108          | 0.3220    | 0.2043 | 0.1498   | 0.2081   |
- | 2.0178        | 8.9302  | 1152 | 1.9606          | 0.2890    | 0.2066 | 0.1513   | 0.2127   |
- | 2.0145        | 9.9225  | 1280 | 2.0984          | 0.3189    | 0.2077 | 0.1561   | 0.2062   |
- | 2.0093        | 10.9147 | 1408 | 1.9506          | 0.2829    | 0.2089 | 0.1517   | 0.2157   |
- | 2.014         | 11.9070 | 1536 | 1.9494          | 0.3039    | 0.2086 | 0.1538   | 0.2152   |
- | 2.0137        | 12.8992 | 1664 | 1.9247          | 0.3109    | 0.2110 | 0.1548   | 0.2190   |
- | 2.0055        | 13.8915 | 1792 | 1.8977          | 0.3184    | 0.2121 | 0.1537   | 0.2223   |
- | 2.0058        | 14.8837 | 1920 | 1.9747          | 0.3245    | 0.2094 | 0.1539   | 0.2130   |
- | 1.9975        | 15.8760 | 2048 | 1.9288          | 0.3084    | 0.2109 | 0.1535   | 0.2187   |
- | 1.995         | 16.8682 | 2176 | 1.8964          | 0.3036    | 0.2142 | 0.1590   | 0.2247   |
- | 1.9959        | 17.8605 | 2304 | 1.9247          | 0.3164    | 0.2144 | 0.1605   | 0.2209   |
- | 2.003         | 18.8527 | 2432 | 1.9297          | 0.3152    | 0.2151 | 0.1595   | 0.2217   |
- | 1.9908        | 19.8450 | 2560 | 1.8936          | 0.3065    | 0.2144 | 0.1610   | 0.2256   |
- | 1.9843        | 20.8372 | 2688 | 1.9238          | 0.3201    | 0.2168 | 0.1613   | 0.2242   |
- | 2.0042        | 21.8295 | 2816 | 1.9712          | 0.3228    | 0.2095 | 0.1577   | 0.2119   |
- | 1.9913        | 22.8217 | 2944 | 1.9070          | 0.3134    | 0.2168 | 0.1612   | 0.2250   |
- | 1.9855        | 23.8140 | 3072 | 1.9155          | 0.3123    | 0.2166 | 0.1611   | 0.2242   |
- | 1.9892        | 24.8062 | 3200 | 1.9338          | 0.3213    | 0.2163 | 0.1619   | 0.2220   |
- | 1.9964        | 25.7984 | 3328 | 1.9309          | 0.3125    | 0.2167 | 0.1625   | 0.2226   |
- | 1.9704        | 26.7907 | 3456 | 1.9165          | 0.3101    | 0.2187 | 0.1648   | 0.2258   |
- | 1.9977        | 27.7829 | 3584 | 1.9165          | 0.3177    | 0.2193 | 0.1653   | 0.2264   |
- | 1.9976        | 28.7752 | 3712 | 1.9127          | 0.3099    | 0.2191 | 0.1643   | 0.2269   |
- | 1.9728        | 29.7674 | 3840 | 1.9129          | 0.3096    | 0.2186 | 0.1640   | 0.2264   |
+ | Training Loss | Epoch   | Step  | Validation Loss | Precision | Recall | F1 Macro | Accuracy |
+ |:-------------:|:-------:|:-----:|:---------------:|:---------:|:------:|:--------:|:--------:|
+ | 3.8465        | 0.0945  | 256   | 3.0189          | 0.0366    | 0.1607 | 0.0341   | 0.0453   |
+ | 1.8667        | 0.1890  | 512   | 0.8075          | 0.1370    | 0.1744 | 0.1380   | 0.3775   |
+ | 0.5548        | 0.2835  | 768   | 0.4790          | 0.2172    | 0.1802 | 0.1642   | 0.5589   |
+ | 0.4568        | 0.3780  | 1024  | 0.4423          | 0.2393    | 0.1822 | 0.1665   | 0.5652   |
+ | 0.4347        | 0.4725  | 1280  | 0.4239          | 0.2335    | 0.1842 | 0.1707   | 0.5658   |
+ | 0.4217        | 0.5670  | 1536  | 0.4139          | 0.2458    | 0.1912 | 0.1814   | 0.5745   |
+ | 0.421         | 0.6615  | 1792  | 0.4068          | 0.2593    | 0.1968 | 0.1896   | 0.5817   |
+ | 0.4024        | 0.7560  | 2048  | 0.3991          | 0.2657    | 0.1946 | 0.1856   | 0.5821   |
+ | 0.4024        | 0.8505  | 2304  | 0.3964          | 0.2669    | 0.2010 | 0.1954   | 0.5876   |
+ | 0.3864        | 0.9450  | 2560  | 0.3937          | 0.2636    | 0.1967 | 0.1891   | 0.5842   |
+ | 0.3989        | 1.0395  | 2816  | 0.3927          | 0.2623    | 0.1983 | 0.1916   | 0.5849   |
+ | 0.391         | 1.1340  | 3072  | 0.3930          | 0.2617    | 0.1982 | 0.1911   | 0.5842   |
+ | 0.3905        | 1.2285  | 3328  | 0.3920          | 0.2677    | 0.1988 | 0.1914   | 0.5876   |
+ | 0.3919        | 1.3230  | 3584  | 0.3911          | 0.2591    | 0.1972 | 0.1895   | 0.5824   |
+ | 0.3831        | 1.4175  | 3840  | 0.3909          | 0.2699    | 0.2027 | 0.1972   | 0.5883   |
+ | 0.3903        | 1.5120  | 4096  | 0.3883          | 0.2663    | 0.2018 | 0.1966   | 0.5866   |
+ | 0.389         | 1.6065  | 4352  | 0.3862          | 0.2640    | 0.2024 | 0.1977   | 0.5869   |
+ | 0.3913        | 1.7010  | 4608  | 0.3886          | 0.2575    | 0.2018 | 0.1963   | 0.5821   |
+ | 0.3913        | 1.7955  | 4864  | 0.3859          | 0.2605    | 0.1958 | 0.1875   | 0.5835   |
+ | 0.3783        | 1.8900  | 5120  | 0.3865          | 0.2611    | 0.1968 | 0.1894   | 0.5835   |
+ | 0.3854        | 1.9845  | 5376  | 0.3877          | 0.2637    | 0.2025 | 0.1978   | 0.5859   |
+ | 0.3816        | 2.0790  | 5632  | 0.3855          | 0.2630    | 0.1981 | 0.1915   | 0.5845   |
+ | 0.3798        | 2.1735  | 5888  | 0.3848          | 0.2637    | 0.1989 | 0.1927   | 0.5856   |
+ | 0.3807        | 2.2680  | 6144  | 0.3856          | 0.2665    | 0.2008 | 0.1943   | 0.5876   |
+ | 0.3886        | 2.3625  | 6400  | 0.3855          | 0.2606    | 0.1943 | 0.1845   | 0.5824   |
+ | 0.3744        | 2.4570  | 6656  | 0.3894          | 0.2646    | 0.2045 | 0.2001   | 0.5856   |
+ | 0.3884        | 2.5515  | 6912  | 0.3859          | 0.2622    | 0.2017 | 0.1961   | 0.5849   |
+ | 0.3888        | 2.6460  | 7168  | 0.3841          | 0.2597    | 0.1939 | 0.1849   | 0.5811   |
+ | 0.3768        | 2.7405  | 7424  | 0.3866          | 0.2597    | 0.1975 | 0.1896   | 0.5828   |
+ | 0.3851        | 2.8350  | 7680  | 0.3853          | 0.2690    | 0.1963 | 0.1875   | 0.5828   |
+ | 0.3846        | 2.9295  | 7936  | 0.3834          | 0.2597    | 0.1939 | 0.1836   | 0.5824   |
+ | 0.3814        | 3.0240  | 8192  | 0.3932          | 0.2692    | 0.2004 | 0.1911   | 0.5838   |
+ | 0.3811        | 3.1185  | 8448  | 0.3891          | 0.2657    | 0.2018 | 0.1946   | 0.5845   |
+ | 0.3773        | 3.2130  | 8704  | 0.3828          | 0.2677    | 0.2023 | 0.1972   | 0.5873   |
+ | 0.3711        | 3.3075  | 8960  | 0.3837          | 0.2654    | 0.1998 | 0.1925   | 0.5862   |
+ | 0.3863        | 3.4020  | 9216  | 0.3824          | 0.2669    | 0.1969 | 0.1888   | 0.5866   |
+ | 0.3884        | 3.4965  | 9472  | 0.3873          | 0.2643    | 0.1984 | 0.1905   | 0.5852   |
+ | 0.3828        | 3.5910  | 9728  | 0.3937          | 0.2594    | 0.1993 | 0.1907   | 0.5779   |
+ | 0.3794        | 3.6855  | 9984  | 0.3828          | 0.2552    | 0.1928 | 0.1819   | 0.5828   |
+ | 0.383         | 3.7800  | 10240 | 0.3829          | 0.2586    | 0.1922 | 0.1812   | 0.5821   |
+ | 0.3821        | 3.8745  | 10496 | 0.4009          | 0.2666    | 0.2048 | 0.1943   | 0.5842   |
+ | 0.3755        | 3.9690  | 10752 | 0.3818          | 0.2667    | 0.2076 | 0.2070   | 0.5862   |
+ | 0.372         | 4.0635  | 11008 | 0.3817          | 0.2647    | 0.1966 | 0.1886   | 0.5852   |
+ | 0.3795        | 4.1580  | 11264 | 0.3958          | 0.2719    | 0.2054 | 0.1961   | 0.5894   |
+ | 0.3821        | 4.2525  | 11520 | 0.3820          | 0.2626    | 0.1974 | 0.1901   | 0.5845   |
+ | 0.3764        | 4.3470  | 11776 | 0.3866          | 0.2665    | 0.2054 | 0.1982   | 0.5887   |
+ | 0.3753        | 4.4415  | 12032 | 0.3803          | 0.2582    | 0.1945 | 0.1854   | 0.5821   |
+ | 0.3813        | 4.5360  | 12288 | 0.3959          | 0.2721    | 0.2064 | 0.1967   | 0.5869   |
+ | 0.3765        | 4.6305  | 12544 | 0.3798          | 0.2684    | 0.2018 | 0.1967   | 0.5890   |
+ | 0.3832        | 4.7250  | 12800 | 0.3802          | 0.2596    | 0.1988 | 0.1923   | 0.5828   |
+ | 0.3849        | 4.8195  | 13056 | 0.3838          | 0.2528    | 0.1955 | 0.1878   | 0.5741   |
+ | 0.3766        | 4.9140  | 13312 | 0.3916          | 0.2588    | 0.1987 | 0.1914   | 0.5752   |
+ | 0.3744        | 5.0085  | 13568 | 0.3791          | 0.2689    | 0.1995 | 0.1916   | 0.5883   |
+ | 0.3811        | 5.1030  | 13824 | 0.3814          | 0.2630    | 0.1936 | 0.1825   | 0.5835   |
+ | 0.3761        | 5.1975  | 14080 | 0.3835          | 0.2656    | 0.1963 | 0.1867   | 0.5849   |
+ | 0.3701        | 5.2920  | 14336 | 0.3811          | 0.2611    | 0.1988 | 0.1931   | 0.5831   |
+ | 0.3775        | 5.3865  | 14592 | 0.3820          | 0.2585    | 0.1944 | 0.1850   | 0.5828   |
+ | 0.3778        | 5.4810  | 14848 | 0.4007          | 0.2653    | 0.2109 | 0.2018   | 0.5849   |
+ | 0.3763        | 5.5755  | 15104 | 0.3822          | 0.2672    | 0.1975 | 0.1888   | 0.5869   |
+ | 0.3741        | 5.6700  | 15360 | 0.3865          | 0.2671    | 0.2028 | 0.1954   | 0.5869   |
+ | 0.3779        | 5.7645  | 15616 | 0.3848          | 0.2661    | 0.1985 | 0.1915   | 0.5745   |
+ | 0.3783        | 5.8590  | 15872 | 0.3823          | 0.2643    | 0.1950 | 0.1857   | 0.5831   |
+ | 0.3774        | 5.9535  | 16128 | 0.3791          | 0.2707    | 0.2020 | 0.1956   | 0.5911   |
+ | 0.3742        | 6.0480  | 16384 | 0.3805          | 0.2643    | 0.2019 | 0.1970   | 0.5862   |
+ | 0.3712        | 6.1425  | 16640 | 0.3838          | 0.2690    | 0.1984 | 0.1913   | 0.5811   |
+ | 0.3714        | 6.2370  | 16896 | 0.3844          | 0.2713    | 0.2003 | 0.1926   | 0.5835   |
+ | 0.3769        | 6.3315  | 17152 | 0.3830          | 0.2731    | 0.1968 | 0.1887   | 0.5828   |
+ | 0.3714        | 6.4260  | 17408 | 0.3816          | 0.2651    | 0.1938 | 0.1841   | 0.5797   |
+ | 0.3804        | 6.5205  | 17664 | 0.3803          | 0.2610    | 0.1998 | 0.1946   | 0.5835   |
+ | 0.3778        | 6.6150  | 17920 | 0.3833          | 0.2676    | 0.2038 | 0.1972   | 0.5869   |
+ | 0.3755        | 6.7095  | 18176 | 0.3791          | 0.2740    | 0.1998 | 0.1933   | 0.5900   |
+ | 0.3777        | 6.8040  | 18432 | 0.3851          | 0.2680    | 0.2064 | 0.1991   | 0.5876   |
+ | 0.3762        | 6.8985  | 18688 | 0.3804          | 0.2705    | 0.2060 | 0.2034   | 0.5914   |
+ | 0.3708        | 6.9930  | 18944 | 0.3792          | 0.2639    | 0.1994 | 0.1930   | 0.5869   |
+ | 0.3756        | 7.0875  | 19200 | 0.3822          | 0.2626    | 0.2042 | 0.1985   | 0.5852   |
+ | 0.3724        | 7.1820  | 19456 | 0.3828          | 0.2601    | 0.1997 | 0.1920   | 0.5828   |
+ | 0.374         | 7.2765  | 19712 | 0.3805          | 0.2652    | 0.1948 | 0.1848   | 0.5849   |
+ | 0.3691        | 7.3710  | 19968 | 0.4004          | 0.2752    | 0.2007 | 0.1914   | 0.5721   |
+ | 0.3721        | 7.4655  | 20224 | 0.3788          | 0.2670    | 0.1989 | 0.1921   | 0.5869   |
+ | 0.3757        | 7.5600  | 20480 | 0.3818          | 0.2743    | 0.2001 | 0.1905   | 0.5904   |
+ | 0.3699        | 7.6545  | 20736 | 0.3919          | 0.2774    | 0.2048 | 0.1951   | 0.5880   |
+ | 0.3705        | 7.7490  | 20992 | 0.3796          | 0.2607    | 0.1972 | 0.1895   | 0.5838   |
+ | 0.3783        | 7.8435  | 21248 | 0.3826          | 0.2752    | 0.2070 | 0.2002   | 0.5932   |
+ | 0.3806        | 7.9380  | 21504 | 0.3835          | 0.2593    | 0.1991 | 0.1904   | 0.5835   |
+ | 0.3828        | 8.0325  | 21760 | 0.3806          | 0.2643    | 0.1965 | 0.1888   | 0.5821   |
+ | 0.3761        | 8.1270  | 22016 | 0.3794          | 0.2668    | 0.2026 | 0.1953   | 0.5873   |
+ | 0.3811        | 8.2215  | 22272 | 0.3834          | 0.2600    | 0.2025 | 0.1947   | 0.5856   |
+ | 0.3667        | 8.3160  | 22528 | 0.3845          | 0.2619    | 0.2009 | 0.1925   | 0.5845   |
+ | 0.3697        | 8.4105  | 22784 | 0.3845          | 0.2733    | 0.2026 | 0.1948   | 0.5900   |
+ | 0.3689        | 8.5050  | 23040 | 0.3939          | 0.2635    | 0.2009 | 0.1918   | 0.5776   |
+ | 0.3682        | 8.5995  | 23296 | 0.3787          | 0.2695    | 0.2004 | 0.1947   | 0.5883   |
+ | 0.3717        | 8.6940  | 23552 | 0.3838          | 0.2704    | 0.1955 | 0.1856   | 0.5831   |
+ | 0.3696        | 8.7885  | 23808 | 0.3850          | 0.2571    | 0.2010 | 0.1923   | 0.5811   |
+ | 0.3713        | 8.8830  | 24064 | 0.3800          | 0.2639    | 0.1969 | 0.1879   | 0.5859   |
+ | 0.3736        | 8.9775  | 24320 | 0.3816          | 0.2747    | 0.2037 | 0.1983   | 0.5925   |
+ | 0.3745        | 9.0720  | 24576 | 0.3807          | 0.2643    | 0.1986 | 0.1920   | 0.5849   |
+ | 0.3653        | 9.1665  | 24832 | 0.3818          | 0.2648    | 0.1944 | 0.1839   | 0.5845   |
+ | 0.3748        | 9.2610  | 25088 | 0.3813          | 0.2670    | 0.2090 | 0.2064   | 0.5887   |
+ | 0.3715        | 9.3555  | 25344 | 0.3812          | 0.2682    | 0.2019 | 0.1949   | 0.5887   |
+ | 0.3716        | 9.4500  | 25600 | 0.3790          | 0.2699    | 0.2021 | 0.1970   | 0.5897   |
+ | 0.3757        | 9.5445  | 25856 | 0.3810          | 0.2670    | 0.1968 | 0.1880   | 0.5873   |
+ | 0.3648        | 9.6390  | 26112 | 0.3797          | 0.2678    | 0.2010 | 0.1932   | 0.5883   |
+ | 0.372         | 9.7335  | 26368 | 0.3788          | 0.2683    | 0.2065 | 0.2020   | 0.5897   |
+ | 0.3667        | 9.8280  | 26624 | 0.3785          | 0.2741    | 0.2005 | 0.1950   | 0.5887   |
+ | 0.3784        | 9.9225  | 26880 | 0.3822          | 0.2735    | 0.2050 | 0.1979   | 0.5911   |
+ | 0.3766        | 10.0170 | 27136 | 0.3951          | 0.2620    | 0.2082 | 0.1990   | 0.5807   |
+ | 0.3737        | 10.1115 | 27392 | 0.3781          | 0.2665    | 0.2020 | 0.1968   | 0.5873   |
+ | 0.3683        | 10.2060 | 27648 | 0.3764          | 0.2680    | 0.2026 | 0.1984   | 0.5883   |
+ | 0.3627        | 10.3005 | 27904 | 0.3807          | 0.2684    | 0.1980 | 0.1903   | 0.5887   |
+ | 0.3732        | 10.3950 | 28160 | 0.3788          | 0.2626    | 0.1950 | 0.1862   | 0.5835   |
+ | 0.3758        | 10.4895 | 28416 | 0.3830          | 0.2662    | 0.1998 | 0.1916   | 0.5859   |
+ | 0.3634        | 10.5840 | 28672 | 0.3829          | 0.2621    | 0.1998 | 0.1911   | 0.5838   |
+ | 0.3742        | 10.6785 | 28928 | 0.3842          | 0.2684    | 0.2029 | 0.1982   | 0.5838   |
+ | 0.3685        | 10.7730 | 29184 | 0.3768          | 0.2693    | 0.1989 | 0.1921   | 0.5880   |
+ | 0.3681        | 10.8675 | 29440 | 0.3770          | 0.2689    | 0.1997 | 0.1935   | 0.5880   |
+ | 0.3793        | 10.9620 | 29696 | 0.3789          | 0.2660    | 0.1987 | 0.1899   | 0.5887   |
+ | 0.3771        | 11.0565 | 29952 | 0.3825          | 0.2670    | 0.2016 | 0.1931   | 0.5873   |
+ | 0.3638        | 11.1510 | 30208 | 0.3840          | 0.2668    | 0.1977 | 0.1901   | 0.5821   |
+ | 0.3666        | 11.2455 | 30464 | 0.3836          | 0.2675    | 0.2008 | 0.1920   | 0.5876   |
+ | 0.3702        | 11.3400 | 30720 | 0.3793          | 0.2691    | 0.2016 | 0.1970   | 0.5869   |
+ | 0.3679        | 11.4345 | 30976 | 0.3796          | 0.2722    | 0.1976 | 0.1898   | 0.5876   |
+ | 0.3602        | 11.5290 | 31232 | 0.3814          | 0.2676    | 0.2031 | 0.1957   | 0.5883   |
+ | 0.3816        | 11.6235 | 31488 | 0.3825          | 0.2592    | 0.1997 | 0.1915   | 0.5831   |
+ | 0.3688        | 11.7180 | 31744 | 0.3790          | 0.2613    | 0.1991 | 0.1929   | 0.5835   |
+ | 0.3712        | 11.8125 | 32000 | 0.3795          | 0.2666    | 0.1997 | 0.1910   | 0.5883   |
+ | 0.3744        | 11.9070 | 32256 | 0.3778          | 0.2698    | 0.1998 | 0.1922   | 0.5887   |
+ | 0.3686        | 12.0015 | 32512 | 0.3775          | 0.2644    | 0.2008 | 0.1941   | 0.5876   |
+ | 0.3698        | 12.0960 | 32768 | 0.3813          | 0.2695    | 0.2041 | 0.1966   | 0.5894   |
+ | 0.3628        | 12.1905 | 33024 | 0.3772          | 0.2707    | 0.2000 | 0.1933   | 0.5894   |
+ | 0.37          | 12.2850 | 33280 | 0.3818          | 0.2692    | 0.2053 | 0.1990   | 0.5897   |
+ | 0.3735        | 12.3795 | 33536 | 0.3763          | 0.2662    | 0.2015 | 0.1953   | 0.5880   |
+ | 0.3662        | 12.4740 | 33792 | 0.3779          | 0.2704    | 0.2049 | 0.2002   | 0.5907   |
+ | 0.3732        | 12.5685 | 34048 | 0.3796          | 0.2665    | 0.2004 | 0.1925   | 0.5880   |
+ | 0.364         | 12.6630 | 34304 | 0.3803          | 0.2660    | 0.2016 | 0.1948   | 0.5890   |
+ | 0.3722        | 12.7575 | 34560 | 0.3787          | 0.2689    | 0.2046 | 0.2003   | 0.5904   |
+ | 0.3701        | 12.8520 | 34816 | 0.3828          | 0.2688    | 0.2050 | 0.1984   | 0.5883   |
+ | 0.3697        | 12.9465 | 35072 | 0.3932          | 0.2674    | 0.2088 | 0.1993   | 0.5852   |
+ | 0.3671        | 13.0410 | 35328 | 0.3762          | 0.2685    | 0.1992 | 0.1925   | 0.5873   |
+ | 0.3641        | 13.1355 | 35584 | 0.3776          | 0.2654    | 0.2006 | 0.1939   | 0.5876   |
+ | 0.3707        | 13.2300 | 35840 | 0.3793          | 0.2678    | 0.2022 | 0.1948   | 0.5883   |
+ | 0.365         | 13.3245 | 36096 | 0.3771          | 0.2713    | 0.2015 | 0.1960   | 0.5900   |
+ | 0.3733        | 13.4190 | 36352 | 0.3775          | 0.2701    | 0.2017 | 0.1970   | 0.5883   |
+ | 0.3688        | 13.5135 | 36608 | 0.3881          | 0.2663    | 0.2036 | 0.1945   | 0.5835   |
+ | 0.3706        | 13.6080 | 36864 | 0.3805          | 0.2708    | 0.2036 | 0.1971   | 0.5907   |
+ | 0.3736        | 13.7025 | 37120 | 0.3778          | 0.2694    | 0.2013 | 0.1964   | 0.5876   |
+ | 0.3666        | 13.7970 | 37376 | 0.3802          | 0.2655    | 0.2020 | 0.1947   | 0.5876   |
+ | 0.3702        | 13.8915 | 37632 | 0.3791          | 0.2741    | 0.1989 | 0.1911   | 0.5911   |
+ | 0.3587        | 13.9860 | 37888 | 0.3804          | 0.2666    | 0.2026 | 0.1962   | 0.5873   |
+ | 0.3676        | 14.0805 | 38144 | 0.3867          | 0.2729    | 0.2042 | 0.1953   | 0.5866   |
+ | 0.3649        | 14.1750 | 38400 | 0.3778          | 0.2687    | 0.2015 | 0.1955   | 0.5894   |
+ | 0.3658        | 14.2695 | 38656 | 0.3778          | 0.2636    | 0.1995 | 0.1938   | 0.5845   |
+ | 0.3579        | 14.3640 | 38912 | 0.3796          | 0.2709    | 0.2047 | 0.1988   | 0.5907   |
+ | 0.3654        | 14.4585 | 39168 | 0.3783          | 0.2643    | 0.1971 | 0.1881   | 0.5862   |
+ | 0.3706        | 14.5530 | 39424 | 0.3769          | 0.2674    | 0.2009 | 0.1951   | 0.5876   |
+ | 0.3689        | 14.6475 | 39680 | 0.3779          | 0.2666    | 0.2041 | 0.1985   | 0.5890   |
+ | 0.3654        | 14.7420 | 39936 | 0.3776          | 0.2715    | 0.2015 | 0.1952   | 0.5904   |
+ | 0.3724        | 14.8365 | 40192 | 0.3791          | 0.2711    | 0.2037 | 0.1982   | 0.5914   |
+ | 0.3709        | 14.9310 | 40448 | 0.3778          | 0.2673    | 0.1976 | 0.1902   | 0.5862   |
+ | 0.3703        | 15.0255 | 40704 | 0.3803          | 0.2654    | 0.2018 | 0.1940   | 0.5873   |
+ | 0.3644        | 15.1200 | 40960 | 0.3776          | 0.2714    | 0.2004 | 0.1940   | 0.5907   |
+ | 0.3721        | 15.2145 | 41216 | 0.3791          | 0.2729    | 0.2029 | 0.1977   | 0.5921   |
+ | 0.3633        | 15.3090 | 41472 | 0.3769          | 0.2706    | 0.2008 | 0.1948   | 0.5897   |
+ | 0.3635        | 15.4035 | 41728 | 0.3769          | 0.2639    | 0.1962 | 0.1870   | 0.5856   |
+ | 0.3713        | 15.4980 | 41984 | 0.3803          | 0.2669    | 0.1999 | 0.1909   | 0.5887   |
+ | 0.3707        | 15.5925 | 42240 | 0.3837          | 0.2673    | 0.2010 | 0.1912   | 0.5869   |
+ | 0.3618        | 15.6870 | 42496 | 0.3818          | 0.2702    | 0.2014 | 0.1921   | 0.5894   |
+ | 0.3711        | 15.7815 | 42752 | 0.3770          | 0.2667    | 0.2008 | 0.1944   | 0.5883   |
+ | 0.3672        | 15.8760 | 43008 | 0.3778          | 0.2651    | 0.2022 | 0.1962   | 0.5880   |
+ | 0.3659        | 15.9705 | 43264 | 0.3772          | 0.2692    | 0.2016 | 0.1953   | 0.5897   |
+ | 0.3605        | 16.0650 | 43520 | 0.3804          | 0.2651    | 0.2010 | 0.1927   | 0.5866   |
+ | 0.3729        | 16.1595 | 43776 | 0.3774          | 0.2681    | 0.2003 | 0.1946   | 0.5880   |
+ | 0.3668        | 16.2540 | 44032 | 0.3774          | 0.2685    | 0.2002 | 0.1921   | 0.5894   |
+ | 0.3655        | 16.3485 | 44288 | 0.3763          | 0.2690    | 0.2004 | 0.1947   | 0.5883   |
+ | 0.3692        | 16.4430 | 44544 | 0.3772          | 0.2667    | 0.2019 | 0.1953   | 0.5887   |
+ | 0.3651        | 16.5375 | 44800 | 0.3771          | 0.2638    | 0.1998 | 0.1931   | 0.5866   |
+ | 0.3631        | 16.6320 | 45056 | 0.3791          | 0.2656    | 0.2011 | 0.1938   | 0.5869   |
+ | 0.3688        | 16.7265 | 45312 | 0.3772          | 0.2657    | 0.1991 | 0.1920   | 0.5869   |
+ | 0.3748        | 16.8210 | 45568 | 0.3782          | 0.2695    | 0.2006 | 0.1934   | 0.5894   |
+ | 0.3581        | 16.9155 | 45824 | 0.3772          | 0.2690    | 0.1999 | 0.1929   | 0.5887   |
+ | 0.36          | 17.0100 | 46080 | 0.3774          | 0.2716    | 0.2008 | 0.1942   | 0.5900   |
+ | 0.3621        | 17.1045 | 46336 | 0.3768          | 0.2683    | 0.2008 | 0.1946   | 0.5887   |
+ | 0.3682        | 17.1990 | 46592 | 0.3791          | 0.2676    | 0.2016 | 0.1937   | 0.5894   |
+ | 0.3687        | 17.2935 | 46848 | 0.3774          | 0.2734    | 0.2020 | 0.1952   | 0.5918   |
+ | 0.3609        | 17.3880 | 47104 | 0.3765          | 0.2686    | 0.1998 | 0.1932   | 0.5887   |
+ | 0.369         | 17.4825 | 47360 | 0.3776          | 0.2686    | 0.2007 | 0.1931   | 0.5887   |
+ | 0.3709        | 17.5770 | 47616 | 0.3775          | 0.2703    | 0.2009 | 0.1930   | 0.5897   |
+ | 0.3684        | 17.6715 | 47872 | 0.3770          | 0.2675    | 0.2004 | 0.1930   | 0.5883   |
+ | 0.3612        | 17.7660 | 48128 | 0.3766          | 0.2687    | 0.2012 | 0.1946   | 0.5894   |
+ | 0.3634        | 17.8605 | 48384 | 0.3774          | 0.2712    | 0.2017 | 0.1946   | 0.5911   |
+ | 0.3695        | 17.9550 | 48640 | 0.3771          | 0.2724    | 0.2018 | 0.1953   | 0.5911   |
+ | 0.3578        | 18.0495 | 48896 | 0.3776          | 0.2723    | 0.2028 | 0.1965   | 0.5918   |
+ | 0.3635        | 18.1440 | 49152 | 0.3769          | 0.2697    | 0.2016 | 0.1957   | 0.5900   |
+ | 0.3652        | 18.2385 | 49408 | 0.3783          | 0.2671    | 0.2022 | 0.1953   | 0.5890   |
+ | 0.3734        | 18.3330 | 49664 | 0.3769          | 0.2658    | 0.2002 | 0.1936   | 0.5876   |
+ | 0.3665        | 18.4275 | 49920 | 0.3768          | 0.2695    | 0.2014 | 0.1951   | 0.5900   |
+ | 0.3704        | 18.5220 | 50176 | 0.3771          | 0.2703    | 0.2020 | 0.1957   | 0.5907   |
+ | 0.364         | 18.6165 | 50432 | 0.3770          | 0.2675    | 0.2009 | 0.1940   | 0.5890   |
+ | 0.3587        | 18.7110 | 50688 | 0.3768          | 0.2678    | 0.1999 | 0.1928   | 0.5887   |
+ | 0.3622        | 18.8055 | 50944 | 0.3773          | 0.2694    | 0.2012 | 0.1943   | 0.5897   |
+ | 0.3648        | 18.9000 | 51200 | 0.3773          | 0.2712    | 0.2018 | 0.1953   | 0.5907   |
+ | 0.3688        | 18.9945 | 51456 | 0.3773          | 0.2700    | 0.2013 | 0.1945   | 0.5900   |
+ | 0.3638        | 19.0890 | 51712 | 0.3775          | 0.2712    | 0.2014 | 0.1946   | 0.5907   |
+ | 0.3609        | 19.1835 | 51968 | 0.3773          | 0.2692    | 0.2004 | 0.1933   | 0.5894   |
+ | 0.3633        | 19.2780 | 52224 | 0.3772          | 0.2693    | 0.2007 | 0.1937   | 0.5897   |
+ | 0.366         | 19.3725 | 52480 | 0.3772          | 0.2682    | 0.2000 | 0.1928   | 0.5890   |
+ | 0.365         | 19.4670 | 52736 | 0.3774          | 0.2688    | 0.2003 | 0.1933   | 0.5894   |
+ | 0.3689        | 19.5615 | 52992 | 0.3769          | 0.2678    | 0.1998 | 0.1926   | 0.5887   |
+ | 0.3663        | 19.6560 | 53248 | 0.3773          | 0.2704    | 0.2014 | 0.1947   | 0.5904   |
+ | 0.3712        | 19.7505 | 53504 | 0.3773          | 0.2714    | 0.2015 | 0.1948   | 0.5907   |
+ | 0.3616        | 19.8450 | 53760 | 0.3773          | 0.2713    | 0.2016 | 0.1949   | 0.5907   |
+ | 0.369         | 19.9395 | 54016 | 0.3773          | 0.2713    | 0.2016 | 0.1949   | 0.5907   |


  ### Framework versions
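
The regenerated card stops at the training results and carries no usage snippet. Since the config.json in this commit (below) sets `problem_type` to `"regression"` with a single label (`LABEL_0`), inference reduces to reading one logit per text. A minimal sketch; the repo id `devngho/ko-edu-classifier` is inferred from the commit page, not stated in the diff:

```python
# Minimal inference sketch for the classifier trained in this commit.
# Assumption: the checkpoint loads as "devngho/ko-edu-classifier"
# (repo id inferred from the commit page, not stated in the diff).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "devngho/ko-edu-classifier"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

text = "지구는 태양 주위를 공전하며, 공전 주기는 약 365일이다."
inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    # config.json sets problem_type="regression" with one label, so the
    # logits tensor holds a single educational-value score per input.
    score = model(**inputs).logits.squeeze(-1).item()
print(f"edu score: {score:.3f}")
```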
final/config.json CHANGED
@@ -1,18 +1,20 @@
  {
- "_name_or_path": "intfloat/multilingual-e5-small",
+ "_name_or_path": "lemon-mint/LaBSE-EnKo-Nano-Preview-v0.3",
  "architectures": [
  "BertForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
- "classifier_dropout": 0.3,
+ "classifier_dropout": 0.0,
+ "directionality": "bidi",
+ "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.0,
- "hidden_size": 384,
+ "hidden_size": 768,
  "id2label": {
  "0": "LABEL_0"
  },
  "initializer_range": 0.02,
- "intermediate_size": 1536,
+ "intermediate_size": 3072,
  "label2id": {
  "LABEL_0": 0
  },
@@ -22,12 +24,16 @@
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
+ "pooler_fc_size": 768,
+ "pooler_num_attention_heads": 12,
+ "pooler_num_fc_layers": 3,
+ "pooler_size_per_head": 128,
+ "pooler_type": "first_token_transform",
  "position_embedding_type": "absolute",
  "problem_type": "regression",
- "tokenizer_class": "XLMRobertaTokenizer",
  "torch_dtype": "float32",
  "transformers_version": "4.43.3",
  "type_vocab_size": 2,
  "use_cache": true,
- "vocab_size": 250037
+ "vocab_size": 51547
  }
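
This config change amounts to a backbone swap: hidden_size 384 → 768, intermediate_size 1536 → 3072, and the 250037-entry XLM-R vocabulary replaced by a 51547-entry BERT vocabulary, matching the move from intfloat/multilingual-e5-small to the LaBSE-EnKo-Nano base. A back-of-envelope check, using only numbers that appear in this commit, shows the new safetensors size is consistent with a float32 checkpoint of that shape:

```python
# Sanity check relating the new config.json to the new model.safetensors size.
# Every constant below is copied from the diffs in this commit; float32
# weights take 4 bytes per parameter (plus a small safetensors header).
new_size_bytes = 502_544_380            # final/model.safetensors after this commit
approx_params = new_size_bytes / 4      # ≈ 125.6M parameters overall

vocab_size, hidden_size = 51_547, 768   # from the new config.json
embedding_params = vocab_size * hidden_size  # ≈ 39.6M in the token embeddings alone

print(f"total ≈ {approx_params / 1e6:.1f}M params, "
      f"embeddings ≈ {embedding_params / 1e6:.1f}M")
```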
final/model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ea2db5d380214c25f5d780371f1b916e6675ca8de761fa1d2a1132b1d0a88a6c
- size 472480332
+ oid sha256:8162fbc38e6abb0c97df9662521f0724a04d8f86e64ae1c3114f9eef6f9908c1
+ size 502544380
final/special_tokens_map.json CHANGED
@@ -1,48 +1,34 @@
  {
- "bos_token": {
- "content": "<s>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false
- },
  "cls_token": {
- "content": "<s>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false
- },
- "eos_token": {
- "content": "</s>",
+ "content": "[CLS]",
  "lstrip": false,
  "normalized": false,
  "rstrip": false,
  "single_word": false
  },
  "mask_token": {
- "content": "<mask>",
+ "content": "[MASK]",
  "lstrip": false,
  "normalized": false,
  "rstrip": false,
  "single_word": false
  },
  "pad_token": {
- "content": "<pad>",
+ "content": "[PAD]",
  "lstrip": false,
  "normalized": false,
  "rstrip": false,
  "single_word": false
  },
  "sep_token": {
- "content": "</s>",
+ "content": "[SEP]",
  "lstrip": false,
  "normalized": false,
  "rstrip": false,
  "single_word": false
  },
  "unk_token": {
- "content": "<unk>",
+ "content": "[UNK]",
  "lstrip": false,
  "normalized": false,
  "rstrip": false,
final/tokenizer.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:cd98e5698b201ba914efb8c18b6709fa8735ab71dcad8d2b431e52e8bf68d932
- size 17082800
+ oid sha256:942f85e5b63faaa0da4d5965666c3879faa324d72f1784a7a36f9ba164665240
+ size 1164978
final/tokenizer_config.json CHANGED
@@ -1,7 +1,7 @@
  {
  "added_tokens_decoder": {
  "0": {
- "content": "<s>",
+ "content": "[PAD]",
  "lstrip": false,
  "normalized": false,
  "rstrip": false,
@@ -9,7 +9,7 @@
  "special": true
  },
  "1": {
- "content": "<pad>",
+ "content": "[UNK]",
  "lstrip": false,
  "normalized": false,
  "rstrip": false,
@@ -17,7 +17,7 @@
  "special": true
  },
  "2": {
- "content": "</s>",
+ "content": "[CLS]",
  "lstrip": false,
  "normalized": false,
  "rstrip": false,
@@ -25,15 +25,15 @@
  "special": true
  },
  "3": {
- "content": "<unk>",
+ "content": "[SEP]",
  "lstrip": false,
  "normalized": false,
  "rstrip": false,
  "single_word": false,
  "special": true
  },
- "250001": {
- "content": "<mask>",
+ "4": {
+ "content": "[MASK]",
  "lstrip": false,
  "normalized": false,
  "rstrip": false,
@@ -41,15 +41,18 @@
  "special": true
  }
  },
- "bos_token": "<s>",
  "clean_up_tokenization_spaces": true,
- "cls_token": "<s>",
- "eos_token": "</s>",
- "mask_token": "<mask>",
+ "cls_token": "[CLS]",
+ "do_basic_tokenize": true,
+ "do_lower_case": false,
+ "full_tokenizer_file": null,
+ "mask_token": "[MASK]",
  "model_max_length": 512,
- "pad_token": "<pad>",
- "sep_token": "</s>",
- "sp_model_kwargs": {},
- "tokenizer_class": "XLMRobertaTokenizer",
- "unk_token": "<unk>"
+ "never_split": null,
+ "pad_token": "[PAD]",
+ "sep_token": "[SEP]",
+ "strip_accents": null,
+ "tokenize_chinese_chars": true,
+ "tokenizer_class": "BertTokenizer",
+ "unk_token": "[UNK]"
  }
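
Together with the new special_tokens_map.json above, this replaces the SentencePiece-based XLMRobertaTokenizer (`<s>`/`</s>` specials, 250k vocab) with a WordPiece BertTokenizer whose [PAD]/[UNK]/[CLS]/[SEP]/[MASK] specials sit at ids 0-4. A small sketch of what encoded inputs now look like; the repo id is an assumption as above, and since the diff only shows the final/ copies of the tokenizer files, a subfolder argument may be needed:

```python
# Sketch: encoding with the BertTokenizer configured by this commit.
# Assumption: tokenizer files load from "devngho/ko-edu-classifier";
# pass subfolder="final" if only the final/ copies exist in the repo.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("devngho/ko-edu-classifier")

enc = tok("예시 문장입니다.")
print(tok.convert_ids_to_tokens(enc["input_ids"]))
# Expected form: ['[CLS]', ..., '[SEP]'] — the added_tokens_decoder
# above maps ids 0-4 to [PAD], [UNK], [CLS], [SEP], [MASK].
```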
final/training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a59a6f7928d1b4f27b7ed498b16c59aa39b052f2b2320a8d875ac0f82372de7e
+ oid sha256:882520aa2973aae4499f83ada418ae35032cf7e32e286b04afe51bd93d269eee
  size 5176
final/vocab.txt CHANGED
The diff for this file is too large to render. See raw diff
 
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:91ba29317a107a8595f1cae51cc6af3e9f1e809ac43133a014268eb90d28de88
+ oid sha256:8162fbc38e6abb0c97df9662521f0724a04d8f86e64ae1c3114f9eef6f9908c1
  size 502544380