R-facebook-bart-base-full-ft-without-tum-nlp-german-gpt2_easy-prior-pp-no_ls-f135
This model is a fine-tuned version of facebook/bart-base on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 4.2584
- Sacrebleu: 8.2960
- Bleu: 0.0830
- Rouge1: 0.2929
- Rouge2: 0.0997
- Rougel: 0.2048
- Sari: 39.1931
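The Sacrebleu and Bleu figures above look like the same score on two scales (0 to 100 and 0 to 1). This is an inference from the numbers, not something the card states. A minimal check on the reported values, where the dictionary keys and the helper name are illustrative:

```python
# Final evaluation metrics, copied from the list above.
final_metrics = {
    "loss": 4.2584,
    "sacrebleu": 8.2960,
    "bleu": 0.0830,
    "rouge1": 0.2929,
    "rouge2": 0.0997,
    "rougel": 0.2048,
    "sari": 39.1931,
}


def same_score_on_two_scales(sacrebleu, bleu, tol=1e-4):
    """Return True if `bleu` (0-1 scale) equals `sacrebleu` (0-100 scale) up to rounding."""
    return abs(sacrebleu / 100.0 - bleu) <= tol


bleu_matches = same_score_on_two_scales(
    final_metrics["sacrebleu"], final_metrics["bleu"]
)
print("Bleu is Sacrebleu / 100 (up to rounding):", bleu_matches)
```

If the assumption holds, the two columns carry the same information, and only one of them needs to be tracked when comparing checkpoints.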
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 15
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1
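The settings above can be written out as a plain dictionary. The sketch below shows how `total_train_batch_size` follows from the per-device batch size and gradient accumulation, and what a linear schedule with 100 warmup steps looks like. Several things here are assumptions rather than facts from the card: the key names (chosen to mirror Hugging Face `TrainingArguments` fields), the single training device, `fp16` as the meaning of "Native AMP", and the total step count, which is only estimated from the last logged row of the results table.

```python
# Hyperparameters copied from the list above. The key names are an assumed
# mapping onto `TrainingArguments` fields; the card lists only short names.
hparams = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "gradient_accumulation_steps": 8,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 100,
    "num_train_epochs": 15,
    "fp16": True,  # assumption: "Native AMP" taken to mean fp16 autocast
    "label_smoothing_factor": 0.1,
}


def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accumulation_steps * num_devices


def linear_warmup_lr(step, base_lr, warmup_steps, total_steps):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Rough estimate only: the last logged row is step 12000 at epoch 14.92,
# and the epoch column is rounded to two decimals.
ESTIMATED_TOTAL_STEPS = round(12000 * 15 / 14.92)

total_train_batch_size = effective_batch_size(
    hparams["per_device_train_batch_size"],
    hparams["gradient_accumulation_steps"],
)
print("total_train_batch_size:", total_train_batch_size)

for step in (0, 50, 100, 6000, ESTIMATED_TOTAL_STEPS):
    lr = linear_warmup_lr(
        step, hparams["learning_rate"], hparams["warmup_steps"], ESTIMATED_TOTAL_STEPS
    )
    print(f"step {step:>6}: lr = {lr:.2e}")
```

The computed batch size of 16 agrees with the `total_train_batch_size` listed above, which is consistent with training on a single device.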
Training results
Training Loss | Epoch | Step | Validation Loss | Sacrebleu | Bleu | Rouge1 | Rouge2 | Rougel | Sari |
---|---|---|---|---|---|---|---|---|---|
2.3838 | 0.12 | 100 | 4.1901 | 2.7687 | 0.0277 | 0.2195 | 0.0672 | 0.1600 | 36.8973 |
2.2981 | 0.25 | 200 | 4.0797 | 2.0475 | 0.0205 | 0.2190 | 0.0703 | 0.1660 | 37.7972 |
2.2176 | 0.37 | 300 | 4.1482 | 3.1045 | 0.0310 | 0.2389 | 0.0772 | 0.1771 | 37.4184 |
2.1516 | 0.5 | 400 | 4.0546 | 3.1815 | 0.0318 | 0.2417 | 0.0794 | 0.1802 | 37.7797 |
2.1023 | 0.62 | 500 | 4.0191 | 2.7271 | 0.0273 | 0.2312 | 0.0746 | 0.1750 | 37.7930 |
2.1247 | 0.75 | 600 | 3.9677 | 2.5983 | 0.0260 | 0.2356 | 0.0763 | 0.1727 | 37.9143 |
2.0458 | 0.87 | 700 | 4.0814 | 2.3494 | 0.0235 | 0.2220 | 0.0732 | 0.1677 | 37.3185 |
2.0463 | 0.99 | 800 | 4.0572 | 3.7528 | 0.0375 | 0.2566 | 0.0876 | 0.1882 | 38.3515 |
1.9574 | 1.12 | 900 | 3.9546 | 4.5851 | 0.0459 | 0.2640 | 0.0823 | 0.1870 | 38.1778 |
1.921 | 1.24 | 1000 | 4.0235 | 4.4996 | 0.0450 | 0.2615 | 0.0879 | 0.1917 | 38.3405 |
1.9052 | 1.37 | 1100 | 4.0168 | 4.9832 | 0.0498 | 0.2804 | 0.0956 | 0.2023 | 38.2440 |
1.9286 | 1.49 | 1200 | 4.0049 | 4.9955 | 0.0500 | 0.2635 | 0.0868 | 0.1835 | 38.2221 |
1.9191 | 1.62 | 1300 | 3.9732 | 4.1180 | 0.0412 | 0.2568 | 0.0789 | 0.1825 | 37.8479 |
1.85 | 1.74 | 1400 | 4.0051 | 4.5305 | 0.0453 | 0.2567 | 0.0835 | 0.1855 | 38.2152 |
1.8769 | 1.87 | 1500 | 3.9861 | 5.0763 | 0.0508 | 0.2611 | 0.0850 | 0.1880 | 37.9740 |
1.8972 | 1.99 | 1600 | 4.0281 | 4.8333 | 0.0483 | 0.2573 | 0.0878 | 0.1905 | 38.2159 |
1.7643 | 2.11 | 1700 | 4.0955 | 5.3967 | 0.0540 | 0.2648 | 0.0883 | 0.1859 | 37.8259 |
1.7762 | 2.24 | 1800 | 4.0478 | 4.9561 | 0.0496 | 0.2649 | 0.0857 | 0.1904 | 37.9307 |
1.783 | 2.36 | 1900 | 4.0079 | 5.3380 | 0.0534 | 0.2763 | 0.0893 | 0.1928 | 38.3892 |
1.7744 | 2.49 | 2000 | 4.0219 | 5.6769 | 0.0568 | 0.2768 | 0.0926 | 0.2031 | 38.0914 |
1.7641 | 2.61 | 2100 | 3.9933 | 5.1400 | 0.0514 | 0.2696 | 0.0839 | 0.1944 | 37.8093 |
1.7682 | 2.74 | 2200 | 4.0418 | 4.7739 | 0.0477 | 0.2656 | 0.0840 | 0.1905 | 37.9627 |
1.7778 | 2.86 | 2300 | 4.0027 | 5.4326 | 0.0543 | 0.2738 | 0.0921 | 0.1945 | 38.0805 |
1.7106 | 2.98 | 2400 | 4.0066 | 6.2237 | 0.0622 | 0.2798 | 0.1018 | 0.2028 | 38.8314 |
1.7087 | 3.11 | 2500 | 4.0495 | 6.2109 | 0.0621 | 0.2855 | 0.0963 | 0.2029 | 38.4817 |
1.7253 | 3.23 | 2600 | 4.0248 | 5.3354 | 0.0534 | 0.2873 | 0.0957 | 0.1982 | 38.7256 |
1.7143 | 3.36 | 2700 | 3.9905 | 5.6144 | 0.0561 | 0.2743 | 0.0935 | 0.1959 | 38.6462 |
1.7731 | 3.48 | 2800 | 3.9773 | 5.0439 | 0.0504 | 0.2743 | 0.0878 | 0.1946 | 38.8186 |
1.6946 | 3.61 | 2900 | 4.0200 | 5.5291 | 0.0553 | 0.2818 | 0.0928 | 0.1960 | 38.3806 |
1.7104 | 3.73 | 3000 | 4.0039 | 5.7966 | 0.0580 | 0.2797 | 0.0942 | 0.1942 | 38.4275 |
1.7429 | 3.85 | 3100 | 3.9536 | 5.4509 | 0.0545 | 0.2708 | 0.0906 | 0.1940 | 38.4027 |
1.6642 | 3.98 | 3200 | 3.9716 | 5.5049 | 0.0550 | 0.2725 | 0.0884 | 0.1934 | 38.5143 |
1.6227 | 4.1 | 3300 | 4.0434 | 5.6225 | 0.0562 | 0.2876 | 0.0952 | 0.2023 | 38.6488 |
1.6334 | 4.23 | 3400 | 4.0302 | 6.1075 | 0.0611 | 0.2823 | 0.0934 | 0.1984 | 38.3430 |
1.604 | 4.35 | 3500 | 4.0565 | 5.4071 | 0.0541 | 0.2762 | 0.0898 | 0.1928 | 37.9436 |
1.6126 | 4.48 | 3600 | 4.0730 | 5.4640 | 0.0546 | 0.2717 | 0.0879 | 0.1953 | 38.0136 |
1.6703 | 4.6 | 3700 | 4.0610 | 5.9317 | 0.0593 | 0.2841 | 0.0906 | 0.1987 | 38.3703 |
1.6476 | 4.72 | 3800 | 4.0361 | 5.7700 | 0.0577 | 0.2764 | 0.0857 | 0.1917 | 38.3045 |
1.6838 | 4.85 | 3900 | 4.0013 | 6.2475 | 0.0625 | 0.2899 | 0.0950 | 0.2031 | 38.7013 |
1.6498 | 4.97 | 4000 | 4.0097 | 5.8688 | 0.0587 | 0.2804 | 0.0897 | 0.1953 | 38.5862 |
1.6005 | 5.1 | 4100 | 4.0600 | 6.3918 | 0.0639 | 0.2958 | 0.0942 | 0.2028 | 38.6827 |
1.6064 | 5.22 | 4200 | 4.0780 | 6.8747 | 0.0687 | 0.2907 | 0.0956 | 0.2022 | 38.3931 |
1.5612 | 5.35 | 4300 | 4.0645 | 6.2556 | 0.0626 | 0.2792 | 0.0867 | 0.1950 | 38.2156 |
1.5775 | 5.47 | 4400 | 4.0382 | 6.4081 | 0.0641 | 0.2922 | 0.0980 | 0.2053 | 38.7928 |
1.619 | 5.6 | 4500 | 4.0033 | 6.0250 | 0.0603 | 0.2866 | 0.0884 | 0.1997 | 38.2987 |
1.6027 | 5.72 | 4600 | 4.0215 | 7.0061 | 0.0701 | 0.2816 | 0.0960 | 0.1973 | 38.5188 |
1.5837 | 5.84 | 4700 | 4.0735 | 6.6794 | 0.0668 | 0.2846 | 0.0953 | 0.1983 | 38.3129 |
1.5743 | 5.97 | 4800 | 4.0566 | 6.9267 | 0.0693 | 0.2791 | 0.0920 | 0.1944 | 38.5447 |
1.5427 | 6.09 | 4900 | 4.0553 | 6.5612 | 0.0656 | 0.2861 | 0.0946 | 0.2002 | 38.7175 |
1.554 | 6.22 | 5000 | 4.0995 | 7.5212 | 0.0752 | 0.2916 | 0.1022 | 0.2043 | 38.6034 |
1.5205 | 6.34 | 5100 | 4.0716 | 7.3604 | 0.0736 | 0.2975 | 0.1032 | 0.2077 | 38.6330 |
1.5357 | 6.47 | 5200 | 4.0734 | 7.0090 | 0.0701 | 0.2834 | 0.0918 | 0.1937 | 38.3315 |
1.5401 | 6.59 | 5300 | 4.0569 | 7.2066 | 0.0721 | 0.2984 | 0.1007 | 0.2089 | 38.7153 |
1.5533 | 6.71 | 5400 | 4.0381 | 8.2701 | 0.0827 | 0.2942 | 0.1012 | 0.2048 | 38.9153 |
1.5758 | 6.84 | 5500 | 4.0514 | 7.7094 | 0.0771 | 0.2909 | 0.0976 | 0.2032 | 38.7672 |
1.5517 | 6.96 | 5600 | 4.0227 | 7.1626 | 0.0716 | 0.2859 | 0.0946 | 0.2013 | 38.9612 |
1.583 | 7.09 | 5700 | 4.0696 | 7.3099 | 0.0731 | 0.3068 | 0.1040 | 0.2079 | 38.9724 |
1.5426 | 7.21 | 5800 | 4.0742 | 7.7215 | 0.0772 | 0.2912 | 0.0993 | 0.1982 | 38.6200 |
1.5312 | 7.34 | 5900 | 4.0981 | 7.4710 | 0.0747 | 0.2918 | 0.1007 | 0.2005 | 38.6598 |
1.5297 | 7.46 | 6000 | 4.0783 | 8.1777 | 0.0818 | 0.3014 | 0.1051 | 0.2091 | 39.1750 |
1.5507 | 7.58 | 6100 | 4.0805 | 8.7263 | 0.0873 | 0.3077 | 0.1062 | 0.2123 | 39.0997 |
1.5468 | 7.71 | 6200 | 4.0709 | 7.3451 | 0.0735 | 0.2881 | 0.0979 | 0.2034 | 38.5349 |
1.5329 | 7.83 | 6300 | 4.0625 | 8.1881 | 0.0819 | 0.2976 | 0.1023 | 0.2056 | 38.9322 |
1.5859 | 7.96 | 6400 | 4.0743 | 8.3942 | 0.0839 | 0.2952 | 0.1048 | 0.2118 | 39.0793 |
1.4119 | 8.08 | 6500 | 4.0952 | 7.5693 | 0.0757 | 0.3094 | 0.1097 | 0.2182 | 39.0750 |
1.4344 | 8.21 | 6600 | 4.1497 | 8.8624 | 0.0886 | 0.3005 | 0.1041 | 0.2103 | 38.9099 |
1.4668 | 8.33 | 6700 | 4.1204 | 7.9935 | 0.0799 | 0.2987 | 0.1012 | 0.2060 | 39.0226 |
1.4787 | 8.46 | 6800 | 4.1036 | 8.2780 | 0.0828 | 0.2978 | 0.1037 | 0.2081 | 38.8047 |
1.4639 | 8.58 | 6900 | 4.0993 | 7.8695 | 0.0787 | 0.2927 | 0.0983 | 0.2009 | 38.6767 |
1.4997 | 8.7 | 7000 | 4.0572 | 7.8299 | 0.0783 | 0.2897 | 0.0968 | 0.2026 | 38.6392 |
1.4656 | 8.83 | 7100 | 4.1112 | 7.5026 | 0.0750 | 0.3045 | 0.1052 | 0.2100 | 39.1639 |
1.4423 | 8.95 | 7200 | 4.1133 | 7.3459 | 0.0735 | 0.2999 | 0.1034 | 0.2076 | 38.9450 |
1.3401 | 9.08 | 7300 | 4.1719 | 7.9625 | 0.0796 | 0.2916 | 0.0989 | 0.2036 | 38.8932 |
1.3586 | 9.2 | 7400 | 4.1550 | 7.5577 | 0.0756 | 0.2964 | 0.1003 | 0.2079 | 39.0236 |
1.3459 | 9.33 | 7500 | 4.1359 | 7.2886 | 0.0729 | 0.2941 | 0.0948 | 0.2013 | 39.0120 |
1.3972 | 9.45 | 7600 | 4.1412 | 7.2976 | 0.0730 | 0.2821 | 0.0943 | 0.2019 | 38.9448 |
1.4024 | 9.57 | 7700 | 4.1360 | 6.9379 | 0.0694 | 0.2891 | 0.0925 | 0.1978 | 38.9563 |
1.3936 | 9.7 | 7800 | 4.1180 | 7.4721 | 0.0747 | 0.2932 | 0.0979 | 0.2033 | 39.0185 |
1.3813 | 9.82 | 7900 | 4.1485 | 7.9716 | 0.0797 | 0.2933 | 0.1026 | 0.2060 | 39.2937 |
1.3519 | 9.95 | 8000 | 4.1221 | 7.9693 | 0.0797 | 0.2973 | 0.1031 | 0.2090 | 39.4926 |
1.2558 | 10.07 | 8100 | 4.2222 | 6.8651 | 0.0687 | 0.2855 | 0.1000 | 0.2060 | 38.9237 |
1.2456 | 10.2 | 8200 | 4.1953 | 6.7560 | 0.0676 | 0.2788 | 0.0918 | 0.2002 | 38.6121 |
1.2781 | 10.32 | 8300 | 4.2009 | 6.8235 | 0.0682 | 0.2826 | 0.0967 | 0.2042 | 39.2030 |
1.27 | 10.44 | 8400 | 4.2159 | 7.2854 | 0.0729 | 0.2774 | 0.0929 | 0.1976 | 39.0060 |
1.3036 | 10.57 | 8500 | 4.2087 | 6.3116 | 0.0631 | 0.2827 | 0.0940 | 0.2010 | 39.1980 |
1.2934 | 10.69 | 8600 | 4.2011 | 7.4083 | 0.0741 | 0.2880 | 0.0951 | 0.2028 | 39.0879 |
1.2928 | 10.82 | 8700 | 4.1859 | 7.4265 | 0.0743 | 0.2830 | 0.0996 | 0.2030 | 38.8993 |
1.2935 | 10.94 | 8800 | 4.1976 | 8.2571 | 0.0826 | 0.2984 | 0.1071 | 0.2190 | 39.5344 |
1.1764 | 11.07 | 8900 | 4.2697 | 7.0769 | 0.0708 | 0.2776 | 0.0946 | 0.1968 | 39.0592 |
1.2216 | 11.19 | 9000 | 4.2470 | 6.8849 | 0.0688 | 0.2821 | 0.0938 | 0.2009 | 39.0743 |
1.2152 | 11.31 | 9100 | 4.2621 | 7.8078 | 0.0781 | 0.2912 | 0.0986 | 0.2051 | 39.2673 |
1.2263 | 11.44 | 9200 | 4.2377 | 8.0541 | 0.0805 | 0.2850 | 0.1039 | 0.2068 | 39.0468 |
1.1959 | 11.56 | 9300 | 4.2244 | 7.6790 | 0.0768 | 0.2886 | 0.0993 | 0.2064 | 39.0468 |
1.1951 | 11.69 | 9400 | 4.2357 | 7.4380 | 0.0744 | 0.2952 | 0.1020 | 0.2080 | 39.2009 |
1.2181 | 11.81 | 9500 | 4.2293 | 7.6378 | 0.0764 | 0.2929 | 0.1026 | 0.2079 | 39.2786 |
1.2182 | 11.94 | 9600 | 4.2261 | 7.3868 | 0.0739 | 0.2886 | 0.0999 | 0.2079 | 39.2408 |
1.1386 | 12.06 | 9700 | 4.2615 | 7.3600 | 0.0736 | 0.2842 | 0.0936 | 0.2011 | 38.6407 |
1.1219 | 12.19 | 9800 | 4.2410 | 8.2778 | 0.0828 | 0.2905 | 0.1010 | 0.2083 | 39.5071 |
1.1763 | 12.31 | 9900 | 4.2356 | 7.7087 | 0.0771 | 0.2894 | 0.1001 | 0.2038 | 39.0565 |
1.1723 | 12.43 | 10000 | 4.2308 | 7.1490 | 0.0715 | 0.2823 | 0.0939 | 0.2036 | 39.1788 |
1.1212 | 12.56 | 10100 | 4.2457 | 7.7867 | 0.0779 | 0.2901 | 0.1016 | 0.2031 | 39.4189 |
1.1285 | 12.68 | 10200 | 4.2474 | 7.6008 | 0.0760 | 0.2886 | 0.0973 | 0.2034 | 38.9518 |
1.14 | 12.81 | 10300 | 4.2269 | 7.3776 | 0.0738 | 0.2864 | 0.0940 | 0.1995 | 38.9967 |
1.1698 | 12.93 | 10400 | 4.2179 | 7.7488 | 0.0775 | 0.2934 | 0.0989 | 0.2049 | 39.2568 |
1.111 | 13.06 | 10500 | 4.2544 | 7.6406 | 0.0764 | 0.2979 | 0.1009 | 0.2075 | 39.1464 |
1.134 | 13.18 | 10600 | 4.2493 | 7.5843 | 0.0758 | 0.2914 | 0.0977 | 0.2030 | 38.7354 |
1.1309 | 13.3 | 10700 | 4.2578 | 7.7002 | 0.0770 | 0.2910 | 0.0979 | 0.2042 | 39.1543 |
1.1817 | 13.43 | 10800 | 4.2485 | 7.7934 | 0.0779 | 0.2950 | 0.0989 | 0.2071 | 38.9693 |
1.1296 | 13.55 | 10900 | 4.2536 | 7.3443 | 0.0734 | 0.2897 | 0.0947 | 0.2027 | 38.5840 |
1.1457 | 13.68 | 11000 | 4.2430 | 7.2824 | 0.0728 | 0.2844 | 0.0927 | 0.1989 | 38.5460 |
1.169 | 13.8 | 11100 | 4.2319 | 7.6855 | 0.0769 | 0.2926 | 0.0966 | 0.2021 | 38.8614 |
1.1712 | 13.93 | 11200 | 4.2432 | 7.5547 | 0.0755 | 0.2880 | 0.0958 | 0.2008 | 38.7928 |
1.1777 | 14.05 | 11300 | 4.2374 | 8.0068 | 0.0801 | 0.2920 | 0.0987 | 0.2064 | 39.2165 |
1.1784 | 14.17 | 11400 | 4.2686 | 8.0437 | 0.0804 | 0.2938 | 0.1005 | 0.2077 | 39.1538 |
1.1555 | 14.3 | 11500 | 4.2601 | 7.6743 | 0.0767 | 0.2867 | 0.0963 | 0.2004 | 39.1612 |
1.1849 | 14.42 | 11600 | 4.2531 | 7.3441 | 0.0734 | 0.2861 | 0.0924 | 0.1975 | 38.8140 |
1.2111 | 14.55 | 11700 | 4.2460 | 7.9645 | 0.0796 | 0.2888 | 0.0966 | 0.2035 | 38.9464 |
1.1611 | 14.67 | 11800 | 4.2580 | 8.1329 | 0.0813 | 0.2898 | 0.0979 | 0.2041 | 38.9930 |
1.1866 | 14.8 | 11900 | 4.2536 | 8.1866 | 0.0819 | 0.2936 | 0.0990 | 0.2050 | 39.1494 |
1.1876 | 14.92 | 12000 | 4.2584 | 8.2960 | 0.0830 | 0.2929 | 0.0997 | 0.2048 | 39.1931 |
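The evaluation results reported at the top of this card match the last logged row (step 12000), which is not the best row on every metric. Below is a minimal sketch for picking a checkpoint by metric, using five rows copied verbatim from the table above. The column keys and helper names are illustrative and are not part of the original training code.

```python
# Column keys for the table above (names chosen for this sketch).
HEADER = [
    "train_loss", "epoch", "step", "val_loss", "sacrebleu",
    "bleu", "rouge1", "rouge2", "rougel", "sari",
]

# Five rows copied verbatim from the results table.
ROWS = """
2.3838 | 0.12 | 100 | 4.1901 | 2.7687 | 0.0277 | 0.2195 | 0.0672 | 0.1600 | 36.8973 |
1.7429 | 3.85 | 3100 | 3.9536 | 5.4509 | 0.0545 | 0.2708 | 0.0906 | 0.1940 | 38.4027 |
1.4344 | 8.21 | 6600 | 4.1497 | 8.8624 | 0.0886 | 0.3005 | 0.1041 | 0.2103 | 38.9099 |
1.2935 | 10.94 | 8800 | 4.1976 | 8.2571 | 0.0826 | 0.2984 | 0.1071 | 0.2190 | 39.5344 |
1.1876 | 14.92 | 12000 | 4.2584 | 8.2960 | 0.0830 | 0.2929 | 0.0997 | 0.2048 | 39.1931 |
"""


def parse_rows(text):
    """Parse pipe-separated table rows into dictionaries keyed by HEADER."""
    rows = []
    for line in text.strip().splitlines():
        # The trailing pipe produces an empty cell, which is dropped here.
        cells = [cell.strip() for cell in line.split("|") if cell.strip()]
        row = dict(zip(HEADER, (float(cell) for cell in cells)))
        row["step"] = int(row["step"])
        rows.append(row)
    return rows


def best_checkpoint(rows, metric, higher_is_better=True):
    """Return the row with the best value of `metric`."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[metric])


rows = parse_rows(ROWS)
print("lowest val_loss at step", best_checkpoint(rows, "val_loss", False)["step"])
print("highest sari at step", best_checkpoint(rows, "sari")["step"])
print("highest sacrebleu at step", best_checkpoint(rows, "sacrebleu")["step"])
```

On these five rows, validation loss is lowest at step 3100, SacreBLEU is highest at step 6600, and SARI peaks at step 8800, so the choice of selection metric changes which checkpoint looks best.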
Framework versions
- Transformers 4.29.2
- Pytorch 2.0.0+cu117
- Datasets 2.12.0
- Tokenizers 0.13.3