baseline_kabyle_sans_target_copy

This model is a fine-tuned version of facebook/mms-1b-all on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3300
  • Wer: 0.4544
  • Bleu: {'bleu': 0.3304851497144104, 'precisions': [0.5629466739967015, 0.38599435502326646, 0.2757586147779873, 0.19908131352619235], 'brevity_penalty': 1.0, 'length_ratio': 1.1015897047691143, 'translation_length': 14552, 'reference_length': 13210}
  • Rouge: {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
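
The Wer and Bleu figures above can be sanity-checked by hand. The following is a minimal sketch, not the evaluation code used for this model. It assumes Wer is the word-level edit distance divided by the number of reference words, and that Bleu is the geometric mean of the four n-gram precisions multiplied by the brevity penalty. The function names `word_error_rate` and `bleu_from_components` are hypothetical.

```python
import math


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def bleu_from_components(precisions, translation_length, reference_length):
    """Recombine n-gram precisions and lengths into a BLEU score."""
    if min(precisions) <= 0:
        return 0.0  # any zero precision collapses the geometric mean
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))
    ratio = translation_length / reference_length
    # The brevity penalty only applies when the output is shorter
    # than the reference.
    brevity_penalty = 1.0 if ratio > 1.0 else math.exp(1.0 - 1.0 / ratio)
    return geo_mean * brevity_penalty


# Components copied from the Bleu entry above.
reported_precisions = [0.5629466739967015, 0.38599435502326646,
                       0.2757586147779873, 0.19908131352619235]
recomputed_bleu = bleu_from_components(reported_precisions, 14552, 13210)
example_wer = word_error_rate("a b c d", "a x c")
```

Under these assumptions, `recomputed_bleu` should agree with the reported 'bleu' value up to floating-point rounding, and the brevity penalty is 1.0 because the output (14552 tokens) is longer than the reference (13210 tokens).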

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 100
  • mixed_precision_training: Native AMP
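
The batch-size and scheduler settings above interact as follows. This is a minimal sketch rather than the training code itself. It assumes a single device, since 4 × 8 already gives the listed total of 32. It also assumes 10600 total optimizer steps, inferred from the 106 steps per epoch in the training results and num_epochs: 100. The function names are hypothetical.

```python
def effective_batch_size(per_device_batch, accumulation_steps, num_devices=1):
    """Number of samples contributing to one optimizer update."""
    return per_device_batch * accumulation_steps * num_devices


def linear_lr(step, base_lr=1e-3, warmup_steps=500, total_steps=10600):
    """Linear warmup from 0 to base_lr, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_batch = effective_batch_size(4, 8)
# Learning rate at a few points: start, mid-warmup, peak, end of training.
schedule = {step: linear_lr(step) for step in (0, 250, 500, 10600)}
```

With these values the learning rate peaks at 0.001 at step 500, which falls within the fifth epoch, and decays linearly to zero by the assumed final step.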

Training results

Training Loss Epoch Step Validation Loss Wer Bleu Rouge
No log 1.0 106 0.9667 0.9487 {'bleu': 0.0, 'precisions': [0.056907709622960045, 0.005558600172238315, 0.000441306266548985, 0.0], 'brevity_penalty': 1.0, 'length_ratio': 1.0761544284632854, 'translation_length': 14216, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
No log 2.0 212 0.4176 0.6724 {'bleu': 0.10625881894558567, 'precisions': [0.33676182479708927, 0.14732664020546346, 0.07119060143783973, 0.0360938439943854], 'brevity_penalty': 1.0, 'length_ratio': 1.081907645722937, 'translation_length': 14292, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
No log 3.0 318 0.3775 0.6512 {'bleu': 0.11776378351806044, 'precisions': [0.3562080188352607, 0.16164025234651486, 0.08031155344006924, 0.04159257063821379], 'brevity_penalty': 1.0, 'length_ratio': 1.0931869795609386, 'translation_length': 14441, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
No log 4.0 424 0.3567 0.6431 {'bleu': 0.1450928802240125, 'precisions': [0.36787637228165865, 0.18525431637890807, 0.10503723171265879, 0.061911440593067524], 'brevity_penalty': 1.0, 'length_ratio': 1.0825889477668433, 'translation_length': 14301, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
1.6921 5.0 530 0.3381 0.6328 {'bleu': 0.15019091561035408, 'precisions': [0.3790065214374913, 0.1921208850512682, 0.10851839000693962, 0.06439468991480087], 'brevity_penalty': 1.0, 'length_ratio': 1.0911430734292202, 'translation_length': 14414, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
1.6921 6.0 636 0.3306 0.6240 {'bleu': 0.1498806312492856, 'precisions': [0.3864984140118604, 0.19534420706026495, 0.10752410468319559, 0.062162427575370716], 'brevity_penalty': 1.0, 'length_ratio': 1.0978046934140802, 'translation_length': 14502, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
1.6921 7.0 742 0.3257 0.6240 {'bleu': 0.16749724280690337, 'precisions': [0.3874825174825175, 0.20712452360581785, 0.12423339758191694, 0.07894209577239031], 'brevity_penalty': 1.0, 'length_ratio': 1.0825132475397425, 'translation_length': 14300, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
1.6921 8.0 848 0.3285 0.6117 {'bleu': 0.15122750690196496, 'precisions': [0.39778393351800556, 0.20066169115949833, 0.10732213951878138, 0.06105512744516894], 'brevity_penalty': 1.0, 'length_ratio': 1.093111279333838, 'translation_length': 14440, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
1.6921 9.0 954 0.3078 0.6068 {'bleu': 0.17011315398726728, 'precisions': [0.4037817471392688, 0.21452401272402824, 0.12589550934824392, 0.07679249051328141], 'brevity_penalty': 1.0, 'length_ratio': 1.0849356548069644, 'translation_length': 14332, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.435 10.0 1060 0.3013 0.5811 {'bleu': 0.17122092075755774, 'precisions': [0.427158273381295, 0.2260816106969953, 0.12359550561797752, 0.0720063128822253], 'brevity_penalty': 1.0, 'length_ratio': 1.094322482967449, 'translation_length': 14456, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.435 11.0 1166 0.3021 0.6038 {'bleu': 0.1827621996373948, 'precisions': [0.4044835648466511, 0.22276509657225743, 0.1380404941660947, 0.08969969676220288], 'brevity_penalty': 1.0, 'length_ratio': 1.1008327024981075, 'translation_length': 14542, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.435 12.0 1272 0.2902 0.5627 {'bleu': 0.19951203392964884, 'precisions': [0.448223209328151, 0.25113767836482837, 0.15092865821905918, 0.09326065411298315], 'brevity_penalty': 1.0, 'length_ratio': 1.0906888720666161, 'translation_length': 14408, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.435 13.0 1378 0.2849 0.5676 {'bleu': 0.20264640978922904, 'precisions': [0.4454456552109528, 0.2538646780082343, 0.15450568678915136, 0.09651930386077215], 'brevity_penalty': 1.0, 'length_ratio': 1.0837244511733535, 'translation_length': 14316, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.435 14.0 1484 0.3012 0.5514 {'bleu': 0.21088514358328722, 'precisions': [0.4592842860097911, 0.2647013782542113, 0.16054058707067229, 0.10133542812254517], 'brevity_penalty': 1.0, 'length_ratio': 1.097880393641181, 'translation_length': 14503, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3723 15.0 1590 0.2874 0.5421 {'bleu': 0.22129598752942345, 'precisions': [0.47020188053097345, 0.2745564856769833, 0.16980480221109, 0.10940272028385571], 'brevity_penalty': 1.0, 'length_ratio': 1.0949280847842544, 'translation_length': 14464, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3723 16.0 1696 0.2793 0.5346 {'bleu': 0.23942091695612744, 'precisions': [0.4785072563925363, 0.2905503953327704, 0.18905386740331492, 0.1250123140577283], 'brevity_penalty': 1.0, 'length_ratio': 1.0953822861468585, 'translation_length': 14470, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3723 17.0 1802 0.2947 0.5476 {'bleu': 0.2157776906765856, 'precisions': [0.46481979188202055, 0.2711968166513621, 0.1664516129032258, 0.10331632653061225], 'brevity_penalty': 1.0, 'length_ratio': 1.0984859954579864, 'translation_length': 14511, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3723 18.0 1908 0.2908 0.5310 {'bleu': 0.22818988897659873, 'precisions': [0.48314762267284933, 0.2860218360756574, 0.17590590677159906, 0.11153884118053499], 'brevity_penalty': 1.0, 'length_ratio': 1.093792581377744, 'translation_length': 14449, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3279 19.0 2014 0.2812 0.5196 {'bleu': 0.2524478824952918, 'precisions': [0.49611603767099743, 0.3051739926739927, 0.19998284881228026, 0.134141572154869], 'brevity_penalty': 1.0, 'length_ratio': 1.1012112036336108, 'translation_length': 14547, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3279 20.0 2120 0.2737 0.5326 {'bleu': 0.23194477430896115, 'precisions': [0.483340217540961, 0.2874722922877016, 0.179553264604811, 0.11601019008426416], 'brevity_penalty': 1.0, 'length_ratio': 1.0996214988644966, 'translation_length': 14526, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3279 21.0 2226 0.2868 0.5258 {'bleu': 0.24141463770041266, 'precisions': [0.4869193069648651, 0.2974547684759276, 0.19032841996379624, 0.12321762218507228], 'brevity_penalty': 1.0, 'length_ratio': 1.09666919000757, 'translation_length': 14487, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3279 22.0 2332 0.2785 0.5164 {'bleu': 0.25195135672945107, 'precisions': [0.4995503908141385, 0.30935915168280315, 0.20058767608676864, 0.12999309596607161], 'brevity_penalty': 1.0, 'length_ratio': 1.0943981831945495, 'translation_length': 14457, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3279 23.0 2438 0.2859 0.5149 {'bleu': 0.2534937082134097, 'precisions': [0.5006899406651028, 0.3091717109800015, 0.20106822880771882, 0.13266509433962265], 'brevity_penalty': 1.0, 'length_ratio': 1.0971990915972747, 'translation_length': 14494, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.299 24.0 2544 0.2739 0.5069 {'bleu': 0.2701767964972912, 'precisions': [0.5073325954620919, 0.3245216322139399, 0.21771823681936042, 0.14864864864864866], 'brevity_penalty': 1.0, 'length_ratio': 1.094322482967449, 'translation_length': 14456, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.299 25.0 2650 0.2785 0.5251 {'bleu': 0.2508539055762563, 'precisions': [0.493896910092767, 0.3066542577943229, 0.19814863330713475, 0.1319492963369598], 'brevity_penalty': 1.0, 'length_ratio': 1.0853141559424677, 'translation_length': 14337, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.299 26.0 2756 0.2765 0.4949 {'bleu': 0.27780783623684363, 'precisions': [0.5206611570247934, 0.3361627284545385, 0.22425648959944988, 0.15174982844819135], 'brevity_penalty': 1.0, 'length_ratio': 1.0991672975018925, 'translation_length': 14520, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.299 27.0 2862 0.2766 0.4910 {'bleu': 0.2816699107692348, 'precisions': [0.5247375283057709, 0.3386900228484387, 0.228287841191067, 0.1551438322769381], 'brevity_penalty': 1.0, 'length_ratio': 1.1031794095382286, 'translation_length': 14573, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.299 28.0 2968 0.2810 0.5032 {'bleu': 0.2727987254975725, 'precisions': [0.511282295998364, 0.3258732799032209, 0.21980819825171857, 0.1512223403227365], 'brevity_penalty': 1.0, 'length_ratio': 1.110446631339894, 'translation_length': 14669, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2706 29.0 3074 0.2732 0.4931 {'bleu': 0.2905577652720203, 'precisions': [0.5250973303670745, 0.34487288463024496, 0.2364759088537137, 0.1664348171701113], 'brevity_penalty': 1.0, 'length_ratio': 1.0888720666162, 'translation_length': 14384, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2706 30.0 3180 0.2899 0.4986 {'bleu': 0.27490450540172295, 'precisions': [0.5185083817995211, 0.3299423018524142, 0.22116122431579846, 0.1509470616804274], 'brevity_penalty': 1.0, 'length_ratio': 1.1063588190764573, 'translation_length': 14615, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2706 31.0 3286 0.2709 0.4770 {'bleu': 0.29856907971690344, 'precisions': [0.5376550400877133, 0.35429657794676805, 0.2444691210386948, 0.17064148739413998], 'brevity_penalty': 1.0, 'length_ratio': 1.1046934140802422, 'translation_length': 14593, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2706 32.0 3392 0.2707 0.4834 {'bleu': 0.2953261697723721, 'precisions': [0.533453137959635, 0.352168905950096, 0.2404593334484545, 0.16839097448024437], 'brevity_penalty': 1.0, 'length_ratio': 1.0952308856926571, 'translation_length': 14468, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.2706 33.0 3498 0.2908 0.5036 {'bleu': 0.2663726365857405, 'precisions': [0.511312530270532, 0.32144504227517295, 0.21276043918042709, 0.14397079139530294], 'brevity_penalty': 1.0, 'length_ratio': 1.0940953822861468, 'translation_length': 14453, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2513 34.0 3604 0.2753 0.4755 {'bleu': 0.3024915870772061, 'precisions': [0.5405479829271651, 0.3590155163188871, 0.24836769759450172, 0.17370432056431862], 'brevity_penalty': 1.0, 'length_ratio': 1.0996214988644966, 'translation_length': 14526, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2513 35.0 3710 0.2826 0.4930 {'bleu': 0.2797826324378019, 'precisions': [0.5237175951461666, 0.33603858816323406, 0.22533998967119986, 0.15451065082948856], 'brevity_penalty': 1.0, 'length_ratio': 1.0979560938682815, 'translation_length': 14504, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.2513 36.0 3816 0.2831 0.4821 {'bleu': 0.290587546085915, 'precisions': [0.5336372427130819, 0.34944380514000767, 0.23628364389233955, 0.16182695147160153], 'brevity_penalty': 1.0, 'length_ratio': 1.095987887963664, 'translation_length': 14478, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2513 37.0 3922 0.2828 0.4782 {'bleu': 0.2998307607023687, 'precisions': [0.53769608181881, 0.3563094872582131, 0.24583513163573587, 0.17159180457052797], 'brevity_penalty': 1.0, 'length_ratio': 1.095457986373959, 'translation_length': 14471, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2323 38.0 4028 0.2842 0.4785 {'bleu': 0.30032010721069907, 'precisions': [0.5376669880181794, 0.3575196880495451, 0.24604675146098315, 0.17199137593100744], 'brevity_penalty': 1.0, 'length_ratio': 1.0993186979560938, 'translation_length': 14522, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2323 39.0 4134 0.2827 0.4764 {'bleu': 0.3003364457627934, 'precisions': [0.5391268533772653, 0.3562666666666667, 0.24696113679164527, 0.17152892965167332], 'brevity_penalty': 1.0, 'length_ratio': 1.1028009084027253, 'translation_length': 14568, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.2323 40.0 4240 0.2918 0.4924 {'bleu': 0.2810271547968192, 'precisions': [0.5239733094861388, 0.3370245914159157, 0.22796326495579777, 0.1549378486835666], 'brevity_penalty': 1.0, 'length_ratio': 1.1004542013626042, 'translation_length': 14537, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2323 41.0 4346 0.2887 0.4829 {'bleu': 0.2892434084679841, 'precisions': [0.531514193415355, 0.34732183732641536, 0.2345022721426734, 0.1616813294232649], 'brevity_penalty': 1.0, 'length_ratio': 1.1013626040878122, 'translation_length': 14549, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2323 42.0 4452 0.2830 0.4734 {'bleu': 0.3039499922078433, 'precisions': [0.5427570574902122, 0.36085696858798416, 0.24955024415317398, 0.17462642836214473], 'brevity_penalty': 1.0, 'length_ratio': 1.102119606358819, 'translation_length': 14559, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.2188 43.0 4558 0.2870 0.4845 {'bleu': 0.28999202018304937, 'precisions': [0.5329621227744552, 0.3478327228327228, 0.23557156333076065, 0.1619401525523176], 'brevity_penalty': 1.0, 'length_ratio': 1.1012112036336108, 'translation_length': 14547, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2188 44.0 4664 0.2872 0.4754 {'bleu': 0.3072650978682605, 'precisions': [0.5410298774611042, 0.36146143850798745, 0.25309278350515463, 0.18009014305310603], 'brevity_penalty': 1.0, 'length_ratio': 1.0996214988644966, 'translation_length': 14526, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2188 45.0 4770 0.2981 0.4880 {'bleu': 0.29450630566128444, 'precisions': [0.5278408699841696, 0.3475469967904631, 0.2414326204586447, 0.1698501322362621], 'brevity_penalty': 1.0, 'length_ratio': 1.0998485995457987, 'translation_length': 14529, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2188 46.0 4876 0.3082 0.4912 {'bleu': 0.28537676720392025, 'precisions': [0.5250017242568453, 0.3400735294117647, 0.2324119521226212, 0.159838883976815], 'brevity_penalty': 1.0, 'length_ratio': 1.0975775927327782, 'translation_length': 14499, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2188 47.0 4982 0.3059 0.4920 {'bleu': 0.2815963604272895, 'precisions': [0.520885028949545, 0.3354764638346728, 0.2278437446222681, 0.15793089909697683], 'brevity_penalty': 1.0, 'length_ratio': 1.0982588947766843, 'translation_length': 14508, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2061 48.0 5088 0.2959 0.4736 {'bleu': 0.3052681437803712, 'precisions': [0.5394321766561514, 0.3586269883552782, 0.2500854993160055, 0.17949717404014812], 'brevity_penalty': 1.0, 'length_ratio': 1.1038607115821348, 'translation_length': 14582, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2061 49.0 5194 0.2868 0.4719 {'bleu': 0.3050000146287494, 'precisions': [0.5431206552412416, 0.3605379795200978, 0.2502791376792923, 0.17657428263637254], 'brevity_penalty': 1.0, 'length_ratio': 1.0998485995457987, 'translation_length': 14529, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2061 50.0 5300 0.2965 0.4673 {'bleu': 0.31695229086128907, 'precisions': [0.5498350288699477, 0.370927127050744, 0.26196192762819415, 0.18889323425889715], 'brevity_penalty': 1.0, 'length_ratio': 1.1012869038607116, 'translation_length': 14548, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2061 51.0 5406 0.2932 0.4685 {'bleu': 0.3169340335274348, 'precisions': [0.5475896168108776, 0.3714459943593262, 0.263103802672148, 0.18853739504003125], 'brevity_penalty': 1.0, 'length_ratio': 1.1023467070401212, 'translation_length': 14562, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1953 52.0 5512 0.2980 0.4703 {'bleu': 0.30862597712853457, 'precisions': [0.5459116897430598, 0.3642343582683188, 0.25440632791677414, 0.179348891939596], 'brevity_penalty': 1.0, 'length_ratio': 1.0989401968205905, 'translation_length': 14517, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1953 53.0 5618 0.2907 0.4660 {'bleu': 0.31450995783489116, 'precisions': [0.5497799779977998, 0.3694374475230898, 0.2598215817464402, 0.18540973987874046], 'brevity_penalty': 1.0, 'length_ratio': 1.1009841029523089, 'translation_length': 14544, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1953 54.0 5724 0.2964 0.4655 {'bleu': 0.3091129279744492, 'precisions': [0.549055829228243, 0.36658316252941625, 0.2548167092924126, 0.1780130134990774], 'brevity_penalty': 1.0, 'length_ratio': 1.1064345193035579, 'translation_length': 14616, 'reference_length': 13210} {'rouge1': 0.000462000462000462, 'rouge2': 0.0, 'rougeL': 0.000462000462000462, 'rougeLsum': 0.000462000462000462}
0.1953 55.0 5830 0.3043 0.4689 {'bleu': 0.3100430620542407, 'precisions': [0.5470438433477872, 0.36428243924805137, 0.25491711758137936, 0.18189832500734646], 'brevity_penalty': 1.0, 'length_ratio': 1.0998485995457987, 'translation_length': 14529, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1953 56.0 5936 0.2953 0.4621 {'bleu': 0.3177578762736204, 'precisions': [0.5543403494290824, 0.371897670866743, 0.26235839340885686, 0.18849089841456254], 'brevity_penalty': 1.0, 'length_ratio': 1.1005299015897048, 'translation_length': 14538, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1854 57.0 6042 0.2983 0.4577 {'bleu': 0.32553472762690394, 'precisions': [0.5576777739608382, 0.3807962172056132, 0.2710600736995458, 0.19509574052364204], 'brevity_penalty': 1.0, 'length_ratio': 1.1018168054504163, 'translation_length': 14555, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1854 58.0 6148 0.2950 0.4662 {'bleu': 0.3150549597648921, 'precisions': [0.5487394380710311, 0.36929998474912307, 0.26098877559763517, 0.18628504444661523], 'brevity_penalty': 1.0, 'length_ratio': 1.1019682059046176, 'translation_length': 14557, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1854 59.0 6254 0.2979 0.4640 {'bleu': 0.3162920692997877, 'precisions': [0.5506298616369519, 0.371293182512993, 0.26217678893565843, 0.18671630094043887], 'brevity_penalty': 1.0, 'length_ratio': 1.0996971990915974, 'translation_length': 14527, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1854 60.0 6360 0.2976 0.4637 {'bleu': 0.31869505159104133, 'precisions': [0.5518334711883099, 0.37305778798316114, 0.26424023403889174, 0.1896348645465253], 'brevity_penalty': 1.0, 'length_ratio': 1.0982588947766843, 'translation_length': 14508, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1854 61.0 6466 0.2974 0.4673 {'bleu': 0.31668941878935697, 'precisions': [0.5491333471445342, 0.3696119036662065, 0.26226821905993963, 0.188957779746088], 'brevity_penalty': 1.0, 'length_ratio': 1.096214988644966, 'translation_length': 14481, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1761 62.0 6572 0.3018 0.4667 {'bleu': 0.3146151709852398, 'precisions': [0.549707199448846, 0.3703335373317013, 0.260297532031989, 0.18489455615497793], 'brevity_penalty': 1.0, 'length_ratio': 1.0987887963663892, 'translation_length': 14515, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1761 63.0 6678 0.3028 0.4676 {'bleu': 0.31236552029402453, 'precisions': [0.5481588348447376, 0.3670403416456951, 0.2574978577549272, 0.18376318874560374], 'brevity_penalty': 1.0, 'length_ratio': 1.101892505677517, 'translation_length': 14556, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1761 64.0 6784 0.3042 0.4621 {'bleu': 0.3215568053974061, 'precisions': [0.5536894273127754, 0.37523882307986245, 0.2670503349939873, 0.1926920062695925], 'brevity_penalty': 1.0, 'length_ratio': 1.099772899318698, 'translation_length': 14528, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1761 65.0 6890 0.3006 0.4631 {'bleu': 0.31842620307399816, 'precisions': [0.5529938059187887, 0.3728127149079239, 0.2640845070422535, 0.18883447600391773], 'brevity_penalty': 1.0, 'length_ratio': 1.0999242997728993, 'translation_length': 14530, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1761 66.0 6996 0.3032 0.4606 {'bleu': 0.3207308393608921, 'precisions': [0.5565894678797905, 0.37634902411021814, 0.26570297711237306, 0.19012563800549667], 'brevity_penalty': 1.0, 'length_ratio': 1.0982588947766843, 'translation_length': 14508, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1649 67.0 7102 0.3126 0.4602 {'bleu': 0.3207726654824389, 'precisions': [0.554032755430686, 0.37741444866920154, 0.26633637994362347, 0.19010999707972354], 'brevity_penalty': 1.0, 'length_ratio': 1.1046934140802422, 'translation_length': 14593, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1649 68.0 7208 0.3217 0.4602 {'bleu': 0.3230138585379056, 'precisions': [0.555189230029535, 0.37679170478804513, 0.26762614580656213, 0.194452583260084], 'brevity_penalty': 1.0, 'length_ratio': 1.102119606358819, 'translation_length': 14559, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1649 69.0 7314 0.3126 0.4610 {'bleu': 0.32145603233088194, 'precisions': [0.5556242699099841, 0.37719298245614036, 0.26682094797291506, 0.1909508453044073], 'brevity_penalty': 1.0, 'length_ratio': 1.101665404996215, 'translation_length': 14553, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1649 70.0 7420 0.3164 0.4569 {'bleu': 0.3274512710726435, 'precisions': [0.5580820223947242, 0.3811956687509532, 0.2730700025704738, 0.19790954381166356], 'brevity_penalty': 1.0, 'length_ratio': 1.1019682059046176, 'translation_length': 14557, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1586 71.0 7526 0.3146 0.4622 {'bleu': 0.31911162358084005, 'precisions': [0.5556706012286878, 0.37496166819993865, 0.263684165158176, 0.18874790990459328], 'brevity_penalty': 1.0, 'length_ratio': 1.09666919000757, 'translation_length': 14487, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1586 72.0 7632 0.3131 0.4592 {'bleu': 0.32389492799699293, 'precisions': [0.5574266272596055, 0.3792919273615138, 0.2690559890251222, 0.19346954736533387], 'brevity_penalty': 1.0, 'length_ratio': 1.1013626040878122, 'translation_length': 14549, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1586 73.0 7738 0.3188 0.4583 {'bleu': 0.32655365796666525, 'precisions': [0.559152401987852, 0.38014564967420467, 0.271332528874332, 0.1971678631133838], 'brevity_penalty': 1.0, 'length_ratio': 1.0967448902346708, 'translation_length': 14488, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1586 74.0 7844 0.3153 0.4610 {'bleu': 0.32129410415655674, 'precisions': [0.5561979489297267, 0.37704416934128077, 0.26685562140341834, 0.1904202174551866], 'brevity_penalty': 1.0, 'length_ratio': 1.0998485995457987, 'translation_length': 14529, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1586 75.0 7950 0.3161 0.4589 {'bleu': 0.32339321657296594, 'precisions': [0.5591479387839514, 0.3787797596264258, 0.2677280550774527, 0.19289290271915185], 'brevity_penalty': 1.0, 'length_ratio': 1.098107494322483, 'translation_length': 14506, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1515 76.0 8056 0.3192 0.4618 {'bleu': 0.316903562701023, 'precisions': [0.5559757942511346, 0.3729292312390259, 0.26115305422100205, 0.18626491880258267], 'brevity_penalty': 1.0, 'length_ratio': 1.1008327024981075, 'translation_length': 14542, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1515 77.0 8162 0.3258 0.4608 {'bleu': 0.32119824364789534, 'precisions': [0.5559601511508073, 0.3774405125076266, 0.2667752163852944, 0.19013190034196384], 'brevity_penalty': 1.0, 'length_ratio': 1.1018168054504163, 'translation_length': 14555, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1515 78.0 8268 0.3163 0.4606 {'bleu': 0.32188554050314144, 'precisions': [0.5552259399793033, 0.37664725712534475, 0.2670341976053062, 0.19223587223587224], 'brevity_penalty': 1.0, 'length_ratio': 1.0972747918243755, 'translation_length': 14495, 'reference_length': 13210} {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.1515 79.0 8374 0.3185 0.4587 {'bleu': 0.3237016833854913, 'precisions': [0.5590416037347247, 0.3778099519926846, 0.2683219178082192, 0.19373414015225454], 'brevity_penalty': 1.0, 'length_ratio': 1.102649507948524, 'translation_length': 14566, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1515 80.0 8480 0.3170 0.4541 {'bleu': 0.3259830632410305, 'precisions': [0.5614325068870524, 0.3814330503938212, 0.2704143029052776, 0.195], 'brevity_penalty': 1.0, 'length_ratio': 1.0991672975018925, 'translation_length': 14520, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1482 81.0 8586 0.3184 0.4580 {'bleu': 0.32730996600026774, 'precisions': [0.5597020073118576, 0.3827945457331086, 0.27258634053914393, 0.19652156824211456], 'brevity_penalty': 1.0, 'length_ratio': 1.0974261922785769, 'translation_length': 14497, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1482 82.0 8692 0.3259 0.4571 {'bleu': 0.32714415344911474, 'precisions': [0.5595417752778159, 0.3824133993148078, 0.2722374273007184, 0.1966270228114642], 'brevity_penalty': 1.0, 'length_ratio': 1.103557910673732, 'translation_length': 14578, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1482 83.0 8798 0.3208 0.4585 {'bleu': 0.32297683121658155, 'precisions': [0.5592123381988433, 0.3792523507377112, 0.2683450764736209, 0.19119952959623676], 'brevity_penalty': 1.0, 'length_ratio': 1.0994700984102952, 'translation_length': 14524, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1482 84.0 8904 0.3202 0.4565 {'bleu': 0.3259497954714864, 'precisions': [0.5598787962261552, 0.38155681296834376, 0.27099269445638163, 0.19498088422703658], 'brevity_penalty': 1.0, 'length_ratio': 1.0992429977289933, 'translation_length': 14521, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.143 85.0 9010 0.3196 0.4582 {'bleu': 0.3261720428963995, 'precisions': [0.5596614368290669, 0.38146535258614106, 0.27219646230465394, 0.19477085781433606], 'brevity_penalty': 1.0, 'length_ratio': 1.1000757002271007, 'translation_length': 14532, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.143 86.0 9116 0.3321 0.4583 {'bleu': 0.3250716033098527, 'precisions': [0.5588943134153889, 0.38, 0.2699665437076435, 0.19475692066907951], 'brevity_penalty': 1.0, 'length_ratio': 1.100908402725208, 'translation_length': 14543, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.143 87.0 9222 0.3228 0.4556 {'bleu': 0.3283395597904491, 'precisions': [0.5614998969850972, 0.3832901356914164, 0.2739186295503212, 0.19714871594570843], 'brevity_penalty': 1.0, 'length_ratio': 1.1022710068130204, 'translation_length': 14561, 'reference_length': 13210} {'rouge1': 0.001386001386001386, 'rouge2': 0.0, 'rougeL': 0.001386001386001386, 'rougeLsum': 0.001386001386001386}
0.143 88.0 9328 0.3272 0.4553 {'bleu': 0.32702135921625014, 'precisions': [0.5603104822090946, 0.3827678231033168, 0.2727039067854695, 0.195546005079117], 'brevity_penalty': 1.0, 'length_ratio': 1.1020439061317184, 'translation_length': 14558, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.143 89.0 9434 0.3285 0.4533 {'bleu': 0.3317606887076569, 'precisions': [0.5640743289745355, 0.3871781156873233, 0.2769666781174854, 0.20027421408285181], 'brevity_penalty': 1.0, 'length_ratio': 1.0999242997728993, 'translation_length': 14530, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1386 90.0 9540 0.3298 0.4550 {'bleu': 0.3285640203482956, 'precisions': [0.5620267107255955, 0.38485056944125967, 0.2738831615120275, 0.1967277358675419], 'brevity_penalty': 1.0, 'length_ratio': 1.0996214988644966, 'translation_length': 14526, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1386 91.0 9646 0.3254 0.4544 {'bleu': 0.3290136476679272, 'precisions': [0.563208909053413, 0.3847680097680098, 0.273904467884401, 0.19741859782927546], 'brevity_penalty': 1.0, 'length_ratio': 1.1012112036336108, 'translation_length': 14547, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1386 92.0 9752 0.3303 0.4555 {'bleu': 0.33061794003559575, 'precisions': [0.5619289688809508, 0.38500838798230896, 0.27538342901208124, 0.20054703526423756], 'brevity_penalty': 1.0, 'length_ratio': 1.1019682059046176, 'translation_length': 14557, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1386 93.0 9858 0.3278 0.4546 {'bleu': 0.3293351458622286, 'precisions': [0.5625988312134754, 0.38406350175545717, 0.27463761900677586, 0.19823960880195599], 'brevity_penalty': 1.0, 'length_ratio': 1.1010598031794094, 'translation_length': 14545, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1386 94.0 9964 0.3283 0.4539 {'bleu': 0.3298655689520526, 'precisions': [0.5631274927795351, 0.38529658752576534, 0.27505147563486615, 0.19839561729602817], 'brevity_penalty': 1.0, 'length_ratio': 1.1008327024981075, 'translation_length': 14542, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1327 95.0 10070 0.3312 0.4536 {'bleu': 0.33147903443612675, 'precisions': [0.5637925111645483, 0.3864399023794997, 0.27680178250064275, 0.20019540791402052], 'brevity_penalty': 1.0, 'length_ratio': 1.1018168054504163, 'translation_length': 14555, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1327 96.0 10176 0.3312 0.4530 {'bleu': 0.33257718368218486, 'precisions': [0.5640197829372167, 0.3874952344643538, 0.27793008910212474, 0.2014065247118578], 'brevity_penalty': 1.0, 'length_ratio': 1.1020439061317184, 'translation_length': 14558, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1327 97.0 10282 0.3314 0.4536 {'bleu': 0.33199702822536054, 'precisions': [0.5630067335440428, 0.3867744641903745, 0.2771683236201577, 0.20128981825288256], 'brevity_penalty': 1.0, 'length_ratio': 1.1017411052233157, 'translation_length': 14554, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1327 98.0 10388 0.3311 0.4538 {'bleu': 0.33149073029031967, 'precisions': [0.5633076711575474, 0.386570011446013, 0.27679643285885785, 0.2003324208056316], 'brevity_penalty': 1.0, 'length_ratio': 1.1012869038607116, 'translation_length': 14548, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1327 99.0 10494 0.3300 0.4543 {'bleu': 0.3305549917495247, 'precisions': [0.5630153930731171, 0.386223205431383, 0.275930053145894, 0.19898358092259577], 'brevity_penalty': 1.0, 'length_ratio': 1.1015897047691143, 'translation_length': 14552, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
0.1306 99.0570 10500 0.3300 0.4544 {'bleu': 0.3304851497144104, 'precisions': [0.5629466739967015, 0.38599435502326646, 0.2757586147779873, 0.19908131352619235], 'brevity_penalty': 1.0, 'length_ratio': 1.1015897047691143, 'translation_length': 14552, 'reference_length': 13210} {'rouge1': 0.000693000693000693, 'rouge2': 0.0, 'rougeL': 0.000693000693000693, 'rougeLsum': 0.000693000693000693}
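
The table rows can be sanity-checked with a few lines of Python. The sketch below is illustrative and is not the evaluation code used for this model. It assumes the standard BLEU definition (brevity penalty times the geometric mean of the four n-gram precisions, with uniform weights), and the helper names brevity_penalty and bleu_from_precisions are hypothetical. It recomputes the final BLEU from the reported precisions, then scans the rows for epochs 83 to 99.057 for the lowest WER and lowest validation loss.

```python
import math

# Values copied from the final row of the results table (step 10500).
precisions = [0.5629466739967015, 0.38599435502326646,
              0.2757586147779873, 0.19908131352619235]
translation_length = 14552
reference_length = 13210


def brevity_penalty(hyp_len, ref_len):
    """Standard BLEU brevity penalty: 1.0 unless the hypothesis is shorter."""
    if hyp_len > ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)


def bleu_from_precisions(precisions, hyp_len, ref_len):
    """BLEU as BP times the geometric mean of the n-gram precisions."""
    if min(precisions) <= 0.0:
        return 0.0  # any zero precision gives BLEU 0, as in the epoch 1 row
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return brevity_penalty(hyp_len, ref_len) * math.exp(log_mean)


# (epoch, step, validation loss, WER) copied from the rows for epochs 83+.
rows = [
    (83.0, 8798, 0.3208, 0.4585), (84.0, 8904, 0.3202, 0.4565),
    (85.0, 9010, 0.3196, 0.4582), (86.0, 9116, 0.3321, 0.4583),
    (87.0, 9222, 0.3228, 0.4556), (88.0, 9328, 0.3272, 0.4553),
    (89.0, 9434, 0.3285, 0.4533), (90.0, 9540, 0.3298, 0.4550),
    (91.0, 9646, 0.3254, 0.4544), (92.0, 9752, 0.3303, 0.4555),
    (93.0, 9858, 0.3278, 0.4546), (94.0, 9964, 0.3283, 0.4539),
    (95.0, 10070, 0.3312, 0.4536), (96.0, 10176, 0.3312, 0.4530),
    (97.0, 10282, 0.3314, 0.4536), (98.0, 10388, 0.3311, 0.4538),
    (99.0, 10494, 0.3300, 0.4543), (99.0570, 10500, 0.3300, 0.4544),
]

best_wer = min(rows, key=lambda r: r[3])   # row with the lowest WER
best_loss = min(rows, key=lambda r: r[2])  # row with the lowest validation loss

if __name__ == "__main__":
    bleu = bleu_from_precisions(precisions, translation_length, reference_length)
    print(f"recomputed BLEU: {bleu:.6f}")
    print(f"length ratio: {translation_length / reference_length:.6f}")
    print(f"lowest WER in these rows: epoch {best_wer[0]}, WER {best_wer[3]}")
    print(f"lowest val loss in these rows: epoch {best_loss[0]}, loss {best_loss[2]}")
```

Within these rows, the final checkpoint (WER 0.4544) is not the one with the lowest WER or the lowest validation loss, so if intermediate checkpoints were saved, an earlier one may be marginally better. The differences are small and may fall within evaluation noise.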

Framework versions

  • Transformers 4.49.0
  • PyTorch 2.6.0+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0