baseline_sim
This model is a fine-tuned version of facebook/mms-1b-all on an unspecified dataset (the auto-generated card records it as None). It achieves the following results on the evaluation set:
- Loss: 0.2816
- Wer: 0.4111
- Bleu: 0.4443
- Rouge: {'rouge1': 0.5431378694774602, 'rouge2': 0.46319971488463374, 'rougeL': 0.543319325066259, 'rougeLsum': 0.5433415661075893}
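For readers unfamiliar with the Wer figure, the following is a minimal sketch of the usual word error rate definition: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The card does not say which implementation produced the number above, so the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenization is plain whitespace splitting, which
    may differ from the metric implementation used for this model card.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
example = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Under this definition, a Wer of 0.4111 corresponds to roughly 41 word errors per 100 reference words.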
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 100
- mixed_precision_training: Native AMP
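The relationship between these values can be sketched as follows. The snippet shows how total_train_batch_size follows from train_batch_size and gradient_accumulation_steps, and how a linear schedule with warmup scales the learning rate over training. It assumes a single training device (the card does not state the device count) and takes the total step count of 31500 from the final row of the training results table. The helper names `effective_batch_size` and `linear_lr` are hypothetical, not part of any library.

```python
LEARNING_RATE = 1e-3
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 8
WARMUP_STEPS = 100
TOTAL_STEPS = 31500  # last step listed in the training results table
NUM_DEVICES = 1      # assumption: the card does not state the device count


def effective_batch_size(per_device: int, accumulation: int,
                         devices: int = NUM_DEVICES) -> int:
    """Number of samples that contribute to one optimizer update."""
    return per_device * accumulation * devices


def linear_lr(step: int, base_lr: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS, total: int = TOTAL_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS)
schedule = {step: linear_lr(step) for step in (0, 50, 100, 15800, 31500)}
```

With these settings the learning rate peaks at 0.001 once warmup ends at step 100 and has decayed to about half of that by the midpoint of training.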
Training results
Training Loss | Epoch | Step | Validation Loss | Wer | Bleu | Rouge |
---|---|---|---|---|---|---|
2.1176 | 1.0 | 316 | 0.4477 | 0.5727 | 0.2807 | {'rouge1': 0.4701052724367798, 'rouge2': 0.3611529347993484, 'rougeL': 0.4691794826862792, 'rougeLsum': 0.4696052978313858} |
0.602 | 2.0 | 632 | 0.3814 | 0.5405 | 0.3115 | {'rouge1': 0.4776896230695047, 'rouge2': 0.3698515651670966, 'rougeL': 0.47769636639646806, 'rougeLsum': 0.47713757408112845} |
0.5628 | 3.0 | 948 | 0.3781 | 0.5454 | 0.3022 | {'rouge1': 0.48165162767814085, 'rouge2': 0.377016908076784, 'rougeL': 0.48139639759906566, 'rougeLsum': 0.4809234187863734} |
0.5369 | 4.0 | 1264 | 0.3828 | 0.5438 | 0.3298 | {'rouge1': 0.4799128679538609, 'rouge2': 0.37693506986258707, 'rougeL': 0.47900138317043967, 'rougeLsum': 0.4793741993097963} |
0.5164 | 5.0 | 1580 | 0.3567 | 0.5279 | 0.3232 | {'rouge1': 0.4841810746249126, 'rouge2': 0.3799552710722608, 'rougeL': 0.48354014412182156, 'rougeLsum': 0.4837832822948419} |
0.5051 | 6.0 | 1896 | 0.3456 | 0.5034 | 0.3388 | {'rouge1': 0.4924306786696943, 'rouge2': 0.3935040295524592, 'rougeL': 0.4922117663849014, 'rougeLsum': 0.49201510795127246} |
0.4927 | 7.0 | 2212 | 0.3493 | 0.5045 | 0.3382 | {'rouge1': 0.481375776908068, 'rouge2': 0.3787311988901129, 'rougeL': 0.4814765220458964, 'rougeLsum': 0.4809969574804486} |
0.4832 | 8.0 | 2528 | 0.3303 | 0.4996 | 0.3445 | {'rouge1': 0.4946803662569613, 'rouge2': 0.3943683867902757, 'rougeL': 0.4946726262643269, 'rougeLsum': 0.4942021611683006} |
0.473 | 9.0 | 2844 | 0.3151 | 0.4942 | 0.3514 | {'rouge1': 0.49671763976588457, 'rouge2': 0.3996629185182432, 'rougeL': 0.4965625704625888, 'rougeLsum': 0.4966950774550053} |
0.4657 | 10.0 | 3160 | 0.3257 | 0.5028 | 0.3391 | {'rouge1': 0.49622226435779726, 'rouge2': 0.39534933906387604, 'rougeL': 0.495871123456577, 'rougeLsum': 0.49539946997954604} |
0.4582 | 11.0 | 3476 | 0.3315 | 0.5046 | 0.3428 | {'rouge1': 0.504731063678411, 'rouge2': 0.405773684536023, 'rougeL': 0.5047963279635649, 'rougeLsum': 0.5046296588492237} |
0.4487 | 12.0 | 3792 | 0.3150 | 0.5126 | 0.3508 | {'rouge1': 0.5036259628111639, 'rouge2': 0.405868420995181, 'rougeL': 0.5037565267864342, 'rougeLsum': 0.5037138735732095} |
0.4396 | 13.0 | 4108 | 0.3273 | 0.5028 | 0.3387 | {'rouge1': 0.4964672848959724, 'rouge2': 0.3963505715751717, 'rougeL': 0.4963496083100021, 'rougeLsum': 0.4965166016782854} |
0.4332 | 14.0 | 4424 | 0.3081 | 0.4974 | 0.3664 | {'rouge1': 0.5044876388028294, 'rouge2': 0.4094930750944624, 'rougeL': 0.5039906295330501, 'rougeLsum': 0.5040206457239715} |
0.434 | 15.0 | 4740 | 0.3221 | 0.5141 | 0.3525 | {'rouge1': 0.5125316853503548, 'rouge2': 0.41730706374512155, 'rougeL': 0.5124824721377286, 'rougeLsum': 0.5121343836452501} |
0.4216 | 16.0 | 5056 | 0.3077 | 0.4797 | 0.3680 | {'rouge1': 0.5026941615068741, 'rouge2': 0.4068122664788285, 'rougeL': 0.5023756815131657, 'rougeLsum': 0.502476875893668} |
0.4197 | 17.0 | 5372 | 0.3211 | 0.5029 | 0.3512 | {'rouge1': 0.5042143560595778, 'rouge2': 0.4075605298525744, 'rougeL': 0.504339551075053, 'rougeLsum': 0.5036754244293081} |
0.4151 | 18.0 | 5688 | 0.3083 | 0.4852 | 0.3652 | {'rouge1': 0.5052730002189119, 'rouge2': 0.4085356600159926, 'rougeL': 0.504541614028222, 'rougeLsum': 0.5050471863733861} |
0.4102 | 19.0 | 6004 | 0.3056 | 0.4853 | 0.3608 | {'rouge1': 0.5103746863007577, 'rouge2': 0.41713234947961464, 'rougeL': 0.5102496875412461, 'rougeLsum': 0.5105914668816096} |
0.4065 | 20.0 | 6320 | 0.3060 | 0.4839 | 0.3660 | {'rouge1': 0.5066886248379499, 'rouge2': 0.41147147644412757, 'rougeL': 0.5062731205155445, 'rougeLsum': 0.5070894969158737} |
0.3967 | 21.0 | 6636 | 0.2942 | 0.4668 | 0.3789 | {'rouge1': 0.5164350415868035, 'rouge2': 0.4220216839870954, 'rougeL': 0.5157510807347886, 'rougeLsum': 0.5159435077481445} |
0.3894 | 22.0 | 6952 | 0.3059 | 0.4761 | 0.3680 | {'rouge1': 0.5202884008337207, 'rouge2': 0.426535380001694, 'rougeL': 0.5206388797087906, 'rougeLsum': 0.5201838917014147} |
0.3894 | 23.0 | 7268 | 0.3161 | 0.4725 | 0.3731 | {'rouge1': 0.5170556350779272, 'rouge2': 0.4231271450499444, 'rougeL': 0.5168371609085032, 'rougeLsum': 0.5169744637718943} |
0.3805 | 24.0 | 7584 | 0.3009 | 0.4732 | 0.3750 | {'rouge1': 0.5178798159838369, 'rouge2': 0.42420807830308693, 'rougeL': 0.5180429072561925, 'rougeLsum': 0.5171848326036026} |
0.3771 | 25.0 | 7900 | 0.2973 | 0.4628 | 0.3875 | {'rouge1': 0.509987472688173, 'rouge2': 0.41428786280897695, 'rougeL': 0.509655390996804, 'rougeLsum': 0.5092013564139832} |
0.3725 | 26.0 | 8216 | 0.2909 | 0.4683 | 0.3812 | {'rouge1': 0.5195869162050247, 'rouge2': 0.42739115664150606, 'rougeL': 0.5193552654170152, 'rougeLsum': 0.5193362883002752} |
0.3661 | 27.0 | 8532 | 0.2919 | 0.4746 | 0.3797 | {'rouge1': 0.5202540587146447, 'rouge2': 0.42611553477197534, 'rougeL': 0.51976188567395, 'rougeLsum': 0.5195430592632275} |
0.3651 | 28.0 | 8848 | 0.2964 | 0.4673 | 0.3816 | {'rouge1': 0.5159625646618995, 'rouge2': 0.421718677430831, 'rougeL': 0.5161539617871874, 'rougeLsum': 0.5149555250925026} |
0.361 | 29.0 | 9164 | 0.3011 | 0.4685 | 0.3871 | {'rouge1': 0.5172066961977984, 'rouge2': 0.42508057717961883, 'rougeL': 0.5176461585751073, 'rougeLsum': 0.5166992656813065} |
0.3537 | 30.0 | 9480 | 0.3022 | 0.4702 | 0.3809 | {'rouge1': 0.5142183010804419, 'rouge2': 0.4212281072788994, 'rougeL': 0.513672623868419, 'rougeLsum': 0.5134193270515841} |
0.35 | 31.0 | 9796 | 0.2891 | 0.4574 | 0.3929 | {'rouge1': 0.5229580414263679, 'rouge2': 0.4346077325386656, 'rougeL': 0.5231434277189111, 'rougeLsum': 0.5231289950159839} |
0.3497 | 32.0 | 10112 | 0.3190 | 0.4756 | 0.3779 | {'rouge1': 0.5240420964547278, 'rouge2': 0.43225173940743766, 'rougeL': 0.5240484200825326, 'rougeLsum': 0.5234499322159889} |
0.3467 | 33.0 | 10428 | 0.3029 | 0.4714 | 0.3846 | {'rouge1': 0.5247309726491365, 'rouge2': 0.43484716342491325, 'rougeL': 0.523668806551089, 'rougeLsum': 0.5236143671656388} |
0.3435 | 34.0 | 10744 | 0.3015 | 0.4644 | 0.3933 | {'rouge1': 0.5220928802347181, 'rouge2': 0.4323880918345855, 'rougeL': 0.5218215283702583, 'rougeLsum': 0.5215037571207151} |
0.3383 | 35.0 | 11060 | 0.2958 | 0.4671 | 0.3884 | {'rouge1': 0.5201593846029072, 'rouge2': 0.4303305500919477, 'rougeL': 0.5202170582834349, 'rougeLsum': 0.5204392868791524} |
0.3378 | 36.0 | 11376 | 0.2865 | 0.4505 | 0.3995 | {'rouge1': 0.5254980092622998, 'rouge2': 0.43868249419528177, 'rougeL': 0.5259994525174937, 'rougeLsum': 0.5255468410663021} |
0.3339 | 37.0 | 11692 | 0.3019 | 0.4649 | 0.3864 | {'rouge1': 0.5289082934497475, 'rouge2': 0.4403604394064534, 'rougeL': 0.529104255127312, 'rougeLsum': 0.5289829740684285} |
0.33 | 38.0 | 12008 | 0.2887 | 0.4553 | 0.3959 | {'rouge1': 0.5241924754733059, 'rouge2': 0.4337710132156277, 'rougeL': 0.5245062958600637, 'rougeLsum': 0.5237140450300832} |
0.3252 | 39.0 | 12324 | 0.2923 | 0.4487 | 0.3982 | {'rouge1': 0.5282486990597959, 'rouge2': 0.43712499012325473, 'rougeL': 0.5281320402300653, 'rougeLsum': 0.5282969843722276} |
0.3231 | 40.0 | 12640 | 0.2895 | 0.4526 | 0.3990 | {'rouge1': 0.5180806375921845, 'rouge2': 0.4274962499823852, 'rougeL': 0.5174285655272781, 'rougeLsum': 0.5180470787161557} |
0.3178 | 41.0 | 12956 | 0.2827 | 0.4456 | 0.4046 | {'rouge1': 0.5288551004286756, 'rouge2': 0.43937417432884296, 'rougeL': 0.5289680502478359, 'rougeLsum': 0.528493928899491} |
0.3169 | 42.0 | 13272 | 0.2892 | 0.4478 | 0.4027 | {'rouge1': 0.5333235906541813, 'rouge2': 0.4443603162781552, 'rougeL': 0.5330034010136978, 'rougeLsum': 0.5326406825150651} |
0.31 | 43.0 | 13588 | 0.2802 | 0.4389 | 0.4129 | {'rouge1': 0.5304714101198026, 'rouge2': 0.44173185307676477, 'rougeL': 0.5300954189016169, 'rougeLsum': 0.5291506939812323} |
0.3073 | 44.0 | 13904 | 0.2869 | 0.4371 | 0.4120 | {'rouge1': 0.5308041781407598, 'rouge2': 0.44432884401915645, 'rougeL': 0.5314521664266045, 'rougeLsum': 0.5307981451360451} |
0.3071 | 45.0 | 14220 | 0.2814 | 0.4369 | 0.4120 | {'rouge1': 0.5342890753829816, 'rouge2': 0.44614556754179713, 'rougeL': 0.5335327097934472, 'rougeLsum': 0.5336625526854244} |
0.3051 | 46.0 | 14536 | 0.3069 | 0.4511 | 0.4040 | {'rouge1': 0.5334566908733793, 'rouge2': 0.4464639465223492, 'rougeL': 0.5332841123240422, 'rougeLsum': 0.5339599203398355} |
0.3076 | 47.0 | 14852 | 0.2819 | 0.4366 | 0.4157 | {'rouge1': 0.530656860856802, 'rouge2': 0.4408999090752492, 'rougeL': 0.5304515790394504, 'rougeLsum': 0.5301366492296686} |
0.3014 | 48.0 | 15168 | 0.2807 | 0.4285 | 0.4192 | {'rouge1': 0.5361600419352455, 'rouge2': 0.4503578843363222, 'rougeL': 0.5359299885307938, 'rougeLsum': 0.5357384533775356} |
0.2972 | 49.0 | 15484 | 0.2824 | 0.4358 | 0.4177 | {'rouge1': 0.530603352756585, 'rouge2': 0.4422960994397888, 'rougeL': 0.5301851986487092, 'rougeLsum': 0.5301276798933865} |
0.2961 | 50.0 | 15800 | 0.2763 | 0.4345 | 0.4183 | {'rouge1': 0.5346699053721147, 'rouge2': 0.4484006985044071, 'rougeL': 0.5346650096835568, 'rougeLsum': 0.5346366210123671} |
0.2901 | 51.0 | 16116 | 0.2807 | 0.4288 | 0.4229 | {'rouge1': 0.53348594544053, 'rouge2': 0.4455804499881205, 'rougeL': 0.5340921809742756, 'rougeLsum': 0.5333074099689908} |
0.2894 | 52.0 | 16432 | 0.2793 | 0.4283 | 0.4206 | {'rouge1': 0.5335896965768283, 'rouge2': 0.4442361617178019, 'rougeL': 0.5330441366886094, 'rougeLsum': 0.5329383884271198} |
0.2888 | 53.0 | 16748 | 0.2843 | 0.4282 | 0.4220 | {'rouge1': 0.5343661644986778, 'rouge2': 0.44887566681721847, 'rougeL': 0.5339094634205507, 'rougeLsum': 0.5337050579468707} |
0.2829 | 54.0 | 17064 | 0.2835 | 0.4304 | 0.4220 | {'rouge1': 0.5348240478926541, 'rouge2': 0.4467007676505266, 'rougeL': 0.5342014008290759, 'rougeLsum': 0.53434981500928} |
0.2832 | 55.0 | 17380 | 0.2822 | 0.4263 | 0.4242 | {'rouge1': 0.5339275785839641, 'rouge2': 0.4446465029673905, 'rougeL': 0.5333339620013073, 'rougeLsum': 0.5333133823561067} |
0.2805 | 56.0 | 17696 | 0.2815 | 0.4232 | 0.4287 | {'rouge1': 0.5344781277148793, 'rouge2': 0.4476924581277197, 'rougeL': 0.534592162739483, 'rougeLsum': 0.5344700824900488} |
0.2775 | 57.0 | 18012 | 0.2910 | 0.4372 | 0.4169 | {'rouge1': 0.5322912089641271, 'rouge2': 0.44439357593730966, 'rougeL': 0.53175473951791, 'rougeLsum': 0.5319678508912925} |
0.274 | 58.0 | 18328 | 0.2769 | 0.4250 | 0.4266 | {'rouge1': 0.5327890209405353, 'rouge2': 0.4435068957908703, 'rougeL': 0.5324932585023235, 'rougeLsum': 0.5323035263916269} |
0.2744 | 59.0 | 18644 | 0.2888 | 0.4347 | 0.4165 | {'rouge1': 0.5369185846025778, 'rouge2': 0.45081294950642925, 'rougeL': 0.5368148447616717, 'rougeLsum': 0.5363699979834334} |
0.2693 | 60.0 | 18960 | 0.2833 | 0.4206 | 0.4267 | {'rouge1': 0.5379121286350527, 'rouge2': 0.44993082399298157, 'rougeL': 0.5374364350211912, 'rougeLsum': 0.5372883986121286} |
0.2651 | 61.0 | 19276 | 0.2825 | 0.4232 | 0.4270 | {'rouge1': 0.5376823504804068, 'rouge2': 0.45156174386371273, 'rougeL': 0.5376195038173543, 'rougeLsum': 0.5376032731449271} |
0.2668 | 62.0 | 19592 | 0.2811 | 0.4247 | 0.4238 | {'rouge1': 0.5369572261952279, 'rouge2': 0.45145007189396835, 'rougeL': 0.5369176891616814, 'rougeLsum': 0.5369281955458396} |
0.2682 | 63.0 | 19908 | 0.2876 | 0.4292 | 0.4208 | {'rouge1': 0.5360702279479654, 'rouge2': 0.44952019776430263, 'rougeL': 0.5361599378420561, 'rougeLsum': 0.5360933388300235} |
0.2638 | 64.0 | 20224 | 0.2850 | 0.4234 | 0.4280 | {'rouge1': 0.5355398000726144, 'rouge2': 0.44988599153311215, 'rougeL': 0.5357454617923272, 'rougeLsum': 0.5352237022672879} |
0.2579 | 65.0 | 20540 | 0.2838 | 0.4209 | 0.4291 | {'rouge1': 0.5378115732530713, 'rouge2': 0.4534258803053122, 'rougeL': 0.5380832421302035, 'rougeLsum': 0.5372210890221913} |
0.2603 | 66.0 | 20856 | 0.2890 | 0.4266 | 0.4257 | {'rouge1': 0.5381010514134678, 'rouge2': 0.45331667387209773, 'rougeL': 0.5375791745511427, 'rougeLsum': 0.5374021723492373} |
0.255 | 67.0 | 21172 | 0.2818 | 0.4183 | 0.4350 | {'rouge1': 0.5355017433705689, 'rouge2': 0.44989734580071916, 'rougeL': 0.535482801756279, 'rougeLsum': 0.5354934130826885} |
0.2551 | 68.0 | 21488 | 0.2810 | 0.4179 | 0.4357 | {'rouge1': 0.5375696553271616, 'rouge2': 0.45360577667106594, 'rougeL': 0.5377789946365092, 'rougeLsum': 0.5369677497904943} |
0.251 | 69.0 | 21804 | 0.2834 | 0.4218 | 0.4304 | {'rouge1': 0.5396386043900199, 'rouge2': 0.4572424091851801, 'rougeL': 0.5393294726698683, 'rougeLsum': 0.5391327261552158} |
0.2471 | 70.0 | 22120 | 0.2845 | 0.4200 | 0.4328 | {'rouge1': 0.5354877901006788, 'rouge2': 0.45082461339724056, 'rougeL': 0.5353444001146976, 'rougeLsum': 0.5354203896768015} |
0.2525 | 71.0 | 22436 | 0.2854 | 0.4185 | 0.4327 | {'rouge1': 0.5379771980227901, 'rouge2': 0.4539120377352185, 'rougeL': 0.5374557940884827, 'rougeLsum': 0.5375123638442493} |
0.2477 | 72.0 | 22752 | 0.2787 | 0.4172 | 0.4375 | {'rouge1': 0.5378203059744184, 'rouge2': 0.45272792564037556, 'rougeL': 0.5380627369669113, 'rougeLsum': 0.537467174358347} |
0.2436 | 73.0 | 23068 | 0.2852 | 0.4161 | 0.4360 | {'rouge1': 0.5383278930625939, 'rouge2': 0.4520666490326796, 'rougeL': 0.5386092750984035, 'rougeLsum': 0.5376854664486965} |
0.2416 | 74.0 | 23384 | 0.2915 | 0.4215 | 0.4325 | {'rouge1': 0.536873865355633, 'rouge2': 0.45262116911172134, 'rougeL': 0.5370586331180425, 'rougeLsum': 0.5367586836795812} |
0.2436 | 75.0 | 23700 | 0.2893 | 0.4233 | 0.4281 | {'rouge1': 0.5377257941968341, 'rouge2': 0.45517921900071145, 'rougeL': 0.5379972695360247, 'rougeLsum': 0.5384993527592379} |
0.2452 | 76.0 | 24016 | 0.2817 | 0.4145 | 0.4397 | {'rouge1': 0.5418625595871899, 'rouge2': 0.46056568946973386, 'rougeL': 0.541468446578519, 'rougeLsum': 0.5414941095256001} |
0.2409 | 77.0 | 24332 | 0.2808 | 0.4143 | 0.4423 | {'rouge1': 0.5392257245009713, 'rouge2': 0.45607066336076896, 'rougeL': 0.538899825555043, 'rougeLsum': 0.5389811116142382} |
0.2373 | 78.0 | 24648 | 0.2841 | 0.4188 | 0.4362 | {'rouge1': 0.5359340139841655, 'rouge2': 0.45263639310304327, 'rougeL': 0.5360656594382994, 'rougeLsum': 0.5360069674357852} |
0.2393 | 79.0 | 24964 | 0.2809 | 0.4161 | 0.4386 | {'rouge1': 0.5391525866209272, 'rouge2': 0.4550841817969568, 'rougeL': 0.5389089291692388, 'rougeLsum': 0.5386840103505839} |
0.2337 | 80.0 | 25280 | 0.2884 | 0.4242 | 0.4298 | {'rouge1': 0.5418567120863942, 'rouge2': 0.4566567449838128, 'rougeL': 0.5411095603319156, 'rougeLsum': 0.5402954951078773} |
0.2334 | 81.0 | 25596 | 0.2824 | 0.4127 | 0.4392 | {'rouge1': 0.5418509341442581, 'rouge2': 0.45784578336087745, 'rougeL': 0.5426102015735883, 'rougeLsum': 0.5417528727173104} |
0.2299 | 82.0 | 25912 | 0.2852 | 0.4165 | 0.4379 | {'rouge1': 0.5416079064420798, 'rouge2': 0.45834119455660505, 'rougeL': 0.5411153340448218, 'rougeLsum': 0.5408759317756201} |
0.2277 | 83.0 | 26228 | 0.2890 | 0.4229 | 0.4330 | {'rouge1': 0.5394241274368741, 'rouge2': 0.45594996931284404, 'rougeL': 0.5396120164435794, 'rougeLsum': 0.5397634735014274} |
0.2313 | 84.0 | 26544 | 0.2895 | 0.4245 | 0.4283 | {'rouge1': 0.5453446541322129, 'rouge2': 0.46280284024839646, 'rougeL': 0.5453884134746472, 'rougeLsum': 0.5452681697277351} |
0.2315 | 85.0 | 26860 | 0.2792 | 0.4154 | 0.4396 | {'rouge1': 0.5429093525548536, 'rouge2': 0.4611084955713472, 'rougeL': 0.5423965358300354, 'rougeLsum': 0.5425649599783315} |
0.2257 | 86.0 | 27176 | 0.2938 | 0.4263 | 0.4275 | {'rouge1': 0.5426098354732698, 'rouge2': 0.45980507697840733, 'rougeL': 0.5420677423797466, 'rougeLsum': 0.5422784837208043} |
0.2269 | 87.0 | 27492 | 0.2832 | 0.4141 | 0.4417 | {'rouge1': 0.5407861890081991, 'rouge2': 0.45768316589935093, 'rougeL': 0.5399885330793596, 'rougeLsum': 0.5402799940195395} |
0.2235 | 88.0 | 27808 | 0.2838 | 0.4139 | 0.4415 | {'rouge1': 0.5418634614310525, 'rouge2': 0.4584064085262245, 'rougeL': 0.5417885182640119, 'rougeLsum': 0.5413254286379844} |
0.2274 | 89.0 | 28124 | 0.2850 | 0.4160 | 0.4371 | {'rouge1': 0.5405937575001661, 'rouge2': 0.458051805733929, 'rougeL': 0.5405110723843589, 'rougeLsum': 0.5404947581314099} |
0.2242 | 90.0 | 28440 | 0.2797 | 0.4128 | 0.4418 | {'rouge1': 0.5429184092787075, 'rouge2': 0.45977757613442993, 'rougeL': 0.5427116656838797, 'rougeLsum': 0.5427156050781381} |
0.2213 | 91.0 | 28756 | 0.2821 | 0.4143 | 0.4404 | {'rouge1': 0.5426471664268708, 'rouge2': 0.4593780623003551, 'rougeL': 0.5418104012736673, 'rougeLsum': 0.5418287085307663} |
0.2199 | 92.0 | 29072 | 0.2800 | 0.4117 | 0.4438 | {'rouge1': 0.5440449456366141, 'rouge2': 0.4601436725607881, 'rougeL': 0.5434666475262009, 'rougeLsum': 0.5432628993810478} |
0.2164 | 93.0 | 29388 | 0.2823 | 0.4138 | 0.4405 | {'rouge1': 0.5433433476910297, 'rouge2': 0.4619011422041214, 'rougeL': 0.5433923373560177, 'rougeLsum': 0.5431214493986625} |
0.2208 | 94.0 | 29704 | 0.2805 | 0.4105 | 0.4449 | {'rouge1': 0.5433842079822675, 'rouge2': 0.4625397714923867, 'rougeL': 0.5436312942590825, 'rougeLsum': 0.5433888917507208} |
0.2189 | 95.0 | 30020 | 0.2808 | 0.4104 | 0.4446 | {'rouge1': 0.5433769767494554, 'rouge2': 0.4621007632660318, 'rougeL': 0.5428793761331363, 'rougeLsum': 0.54301615334437} |
0.2173 | 96.0 | 30336 | 0.2809 | 0.4104 | 0.4441 | {'rouge1': 0.5424258876287353, 'rouge2': 0.4612827742689153, 'rougeL': 0.5424412108802839, 'rougeLsum': 0.5428323083881738} |
0.2156 | 97.0 | 30652 | 0.2815 | 0.4106 | 0.4434 | {'rouge1': 0.5442330389106769, 'rouge2': 0.4629085413716164, 'rougeL': 0.5439738249654837, 'rougeLsum': 0.5441096927446344} |
0.2158 | 98.0 | 30968 | 0.2815 | 0.4103 | 0.4453 | {'rouge1': 0.5430795464214497, 'rouge2': 0.46218161676383207, 'rougeL': 0.5431238378491201, 'rougeLsum': 0.5427355598729996} |
0.2119 | 99.0 | 31284 | 0.2812 | 0.4104 | 0.4445 | {'rouge1': 0.5438763456161022, 'rouge2': 0.462454778529061, 'rougeL': 0.543345705025007, 'rougeLsum': 0.5432414638117005} |
0.2095 | 99.6846 | 31500 | 0.2816 | 0.4111 | 0.4443 | {'rouge1': 0.5431378694774602, 'rouge2': 0.46319971488463374, 'rougeL': 0.543319325066259, 'rougeLsum': 0.5433415661075893} |
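The headline results match the final row (step 31500), but that row is not the best one on every metric: across the table, validation loss is lowest at epoch 50 (0.2763) and Wer is lowest at epoch 98 (0.4103). The card does not say which checkpoint was kept. The sketch below shows one way to parse rows in this table's format and pick the best row per metric; `parse_row` is a hypothetical helper, and only three rows copied from the table are included to keep the example short.

```python
import ast

# Three rows copied verbatim from the training results table.
ROWS = [
    "0.2961 | 50.0 | 15800 | 0.2763 | 0.4345 | 0.4183 | {'rouge1': 0.5346699053721147, 'rouge2': 0.4484006985044071, 'rougeL': 0.5346650096835568, 'rougeLsum': 0.5346366210123671} |",
    "0.2158 | 98.0 | 30968 | 0.2815 | 0.4103 | 0.4453 | {'rouge1': 0.5430795464214497, 'rouge2': 0.46218161676383207, 'rougeL': 0.5431238378491201, 'rougeLsum': 0.5427355598729996} |",
    "0.2095 | 99.6846 | 31500 | 0.2816 | 0.4111 | 0.4443 | {'rouge1': 0.5431378694774602, 'rouge2': 0.46319971488463374, 'rougeL': 0.543319325066259, 'rougeLsum': 0.5433415661075893} |",
]


def parse_row(line: str) -> dict:
    """Split one pipe-delimited table row into typed fields."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    train_loss, epoch, step, val_loss, wer, bleu, rouge = cells
    return {
        "train_loss": float(train_loss),
        "epoch": float(epoch),
        "step": int(step),
        "val_loss": float(val_loss),
        "wer": float(wer),
        "bleu": float(bleu),
        # The Rouge cell is a Python dict literal, so literal_eval can read it.
        "rouge": ast.literal_eval(rouge),
    }


parsed = [parse_row(row) for row in ROWS]
best_by_wer = min(parsed, key=lambda r: r["wer"])        # lower is better
best_by_loss = min(parsed, key=lambda r: r["val_loss"])  # lower is better
best_by_bleu = max(parsed, key=lambda r: r["bleu"])      # higher is better
```

Which row counts as "best" depends on the metric chosen, so the selection criterion is left as an explicit key function rather than fixed.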
Framework versions
- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
Model tree for ilyes25/baseline_sim
Base model
facebook/mms-1b-all