habdine amr-mohamed committed on
Commit
ca3a230
1 Parent(s): 5acafce

Update README.md (#2)


- Update README.md (6b9ce5bc7691bbf1c131db2c4b62110fd7fa91d5)


Co-authored-by: Amr Mohamed <[email protected]>

Files changed (1)
  1. README.md +11 -383
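The training recipe quoted in the diff below (Gemma 2 base models, FSDP across 8 A100 GPUs on SageMaker, Hugging Face Transformers with a LoRA rank of 256) corresponds to a PEFT configuration along the following lines. This is a minimal sketch using the public `peft`/`transformers` APIs, not the project's training script; the base model id, target modules, and every hyperparameter other than the rank are illustrative assumptions.

```python
# Hedged sketch of a LoRA fine-tuning setup (rank 256) with Hugging Face PEFT.
# Only the rank comes from the README; all other values are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model_id = "google/gemma-2-9b-it"  # assumed Gemma 2 base model

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype="bfloat16")

lora_config = LoraConfig(
    r=256,                      # LoRA rank stated in the README
    lora_alpha=256,             # assumption: alpha set equal to the rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    lora_dropout=0.05,          # assumption
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # shows how few parameters LoRA actually trains
```

Under FSDP, the same PEFT-wrapped model would be sharded across the 8 GPUs by the launcher (e.g. `accelerate` or SageMaker's distributed runner); that wiring is outside the scope of this sketch.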
README.md CHANGED
@@ -354,378 +354,6 @@ Our training dataset [Darija-SFT-Mixture](https://huggingface.co/datasets/MBZUAI
354
  Atlas-Chat models are based on Gemma 2 models. They were trained on 8 Nvidia A100 80 GB GPUs in parallel with FSDP on AWS SageMaker, using Hugging Face Transformers and parameter-efficient fine-tuning with a LoRA rank of 256.
355
 
356
 
357
- <!--
358
- ## Evaluation
359
- The Atlas-Chat models were evaluated on a comprehensive suite of tasks using various datasets and benchmarks to assess their performance across multiple dimensions. These included tasks such as:
360
-
361
- * **DarijaMMLU:** A Darija version of ArabicMMLU and MMLU benchmarks translated from MSA and English respectively.
362
- * **DarijaHellaSwag:** A Darija version of HellaSwag.
363
- * **Belebele Ary_Arab:** Belebele is a multiple-choice machine reading comprehension dataset published by Facebook spanning 122 language variants. The Evaluation is done on the Ary_Arab part of Belebele that refers to Darija.
364
- * **Sentiment Analysis.**
365
- * **Translation:** Including six directions and four languages: Darija, MSA, English and French.
366
- * **Transliteration:** Transforming a sentence from Darija (written in Arabic characters) to Arabizi (Written in Latin characters) and vice-versa.
367
- * **Summarization.**
368
-
369
- The models were compared against a collection of existing open-source Arabic models to gauge their effectiveness, with a particular focus on performance in Darija. All scores are based on zero-shot performance. The prompts are written mainly in Darija. The metric used for DarijaMMLU, DarijaHellaSwag, Belebele Ary and Sentiment Analysis is the normalized accuracy. We used [Language Model Evaluation Harness](https://github.com/MBZUAI-Paris/lm-evaluation-harness-atlas-chat) to conduct these evaluations.
370
-
371
- <table>
372
- <tr>
373
- <td rowspan="2">Model</td>
374
- <td rowspan="2"><a href="https://huggingface.co/datasets/MBZUAI-Paris/DarijaMMLU" target="_blank">DarijaMMLU</a></td>
375
- <td rowspan="2"><a href="MBZUAI-Paris/DarijaHellaSwag" target="_blank">DarijaHellaSwag</a></td>
376
- <td rowspan="2"><a href="https://huggingface.co/datasets/facebook/belebele/viewer/ary_Arab" target="_blank">Belebele Ary</a></td>
377
- <td rowspan="2"><a href="https://huggingface.co/datasets/MBZUAI-Paris/DarijaBench" target="_blank">Sentiment Analysis</a></td>
378
- <td colspan="2"><a href="https://huggingface.co/datasets/MBZUAI-Paris/DarijaBench" target="_blank">DODa-10k (Translation)</a></td>
379
- <td colspan="2"><a href="https://huggingface.co/datasets/MBZUAI-Paris/DarijaBench" target="_blank">MADAR (Translation)</a></td>
380
- <td colspan="2"><a href="https://huggingface.co/datasets/MBZUAI-Paris/DarijaBench" target="_blank">FLORES+ (Translation)</a></td>
381
- <td colspan="2"><a href="https://huggingface.co/datasets/MBZUAI-Paris/DarijaBench" target="_blank">NLLB-Seed (Translation)</a></td>
382
- <td colspan="2"><a href="https://huggingface.co/datasets/MBZUAI-Paris/DarijaBench" target="_blank">DODa-10k (Transliteration)</a></td>
383
- <td rowspan="2"><a href="https://huggingface.co/datasets/MBZUAI-Paris/DarijaBench" target="_blank">MArSum (Summarization)</a><br/>(LLM as a judge)</td>
384
- </tr>
385
- <tr>
386
- <td>BLEU</td>
387
- <td>chrF</td>
388
- <td>BLEU</td>
389
- <td>chrF</td>
390
- <td>BLEU</td>
391
- <td>chrF</td>
392
- <td>BLEU</td>
393
- <td>chrF</td>
394
- <td>BLEU</td>
395
- <td>chrF</td>
396
- </tr>
397
- <tr>
398
- <td><a href="https://huggingface.co/inceptionai/jais-family-1p3b-chat" target="_blank">jais-family-1p3b-chat</a></td>
399
- <td>35.39</td>
400
- <td>32.51</td>
401
- <td>38.33</td>
402
- <td>45.29</td>
403
- <td>00.13</td>
404
- <td>06.18</td>
405
- <td>00.50</td>
406
- <td>15.43</td>
407
- <td>02.44</td>
408
- <td>19.14</td>
409
- <td>01.99</td>
410
- <td>12.60</td>
411
- <td>00.01</td>
412
- <td>03.01</td>
413
- <td>00.50</td>
414
- </tr>
415
- <tr>
416
- <td><a href="https://huggingface.co/inceptionai/jais-family-2p7b-chat" target="_blank">jais-family-2p7b-chat</a></td>
417
- <td>37.44</td>
418
- <td>34.49</td>
419
- <td>44.11</td>
420
- <td>51.56</td>
421
- <td>00.25</td>
422
- <td>07.46</td>
423
- <td>00.62</td>
424
- <td>16.36</td>
425
- <td>04.25</td>
426
- <td>18.22</td>
427
- <td>03.10</td>
428
- <td>08.19</td>
429
- <td>00.01</td>
430
- <td>03.27</td>
431
- <td>00.90</td>
432
- </tr>
433
- <tr>
434
- <td><a href="https://huggingface.co/google/gemma-2-2b-it" target="_blank">gemma-2-2b-it</a></td>
435
- <td>28.58</td>
436
- <td>32.42</td>
437
- <td>25.22</td>
438
- <td>53.36</td>
439
- <td>00.10</td>
440
- <td>04.96</td>
441
- <td>00.12</td>
442
- <td>06.66</td>
443
- <td>01.55</td>
444
- <td>18.59</td>
445
- <td>02.78</td>
446
- <td>23.69</td>
447
- <td>00.01</td>
448
- <td>02.08</td>
449
- <td>06.80</td>
450
- </tr>
451
- <tr>
452
- <td><a href="meta-llama/Llama-3.2-1B-Instruct" target="_blank">Llama-3.2-1B-Instruct</a></td>
453
- <td>27.66</td>
454
- <td>26.88</td>
455
- <td>28.89</td>
456
- <td>46.27</td>
457
- <td>00.07</td>
458
- <td>05.95</td>
459
- <td>00.80</td>
460
- <td>18.71</td>
461
- <td>04.53</td>
462
- <td>18.39</td>
463
- <td>04.52</td>
464
- <td>17.06</td>
465
- <td>00.02</td>
466
- <td>03.74</td>
467
- <td>08.23</td>
468
- </tr>
469
- <tr>
470
- <td><a href="meta-llama/Llama-3.2-3B-Instruct" target="_blank">Llama-3.2-3B-Instruct</a></td>
471
- <td>32.60</td>
472
- <td>28.33</td>
473
- <td>38.00</td>
474
- <td>49.20</td>
475
- <td>00.62</td>
476
- <td>13.67</td>
477
- <td>01.18</td>
478
- <td>22.12</td>
479
- <td>08.59</td>
480
- <td>35.21</td>
481
- <td>13.75</td>
482
- <td>43.63</td>
483
- <td>00.21</td>
484
- <td>09.68</td>
485
- <td>08.23</td>
486
- </tr>
487
- <tr>
488
- <td><strong><a href="https://huggingface.co/MBZUAI-Paris/Atlas-Chat-2B" target="_blank">Atlas-Chat-2B</a></strong></td>
489
- <td><b>44.97</td>
490
- <td><b>41.48</td>
491
- <td><b>53.89</td>
492
- <td><b>73.99</td>
493
- <td><b>22.76</td>
494
- <td><b>44.86</td>
495
- <td><b>16.67</td>
496
- <td><b>41.64</td>
497
- <td><b>14.92</td>
498
- <td><b>43.03</td>
499
- <td><b>23.88</td>
500
- <td><b>52.19</td>
501
- <td><b>08.18</td>
502
- <td><b>21.54</td>
503
- <td><b>55.22</td>
504
- </tr>
505
- <tr style="border-top: 4px solid;"></tr>
506
- <tr>
507
- <td><a href="https://huggingface.co/inceptionai/jais-family-6p7b-chat" target="_blank">jais-family-6p7b-chat</a></td>
508
- <td>39.96</td>
509
- <td>41.57</td>
510
- <td>51.22</td>
511
- <td>56.78</td>
512
- <td>00.73</td>
513
- <td>11.85</td>
514
- <td>01.88</td>
515
- <td>23.22</td>
516
- <td>04.25</td>
517
- <td>18.22</td>
518
- <td>04.62</td>
519
- <td>20.22</td>
520
- <td>00.02</td>
521
- <td>03.79</td>
522
- <td>03.02</td>
523
- </tr>
524
- <tr>
525
- <td><a href="https://huggingface.co/inceptionai/jais-adapted-7b-chat" target="_blank">jais-adapted-7b-chat</a></td>
526
- <td>39.30</td>
527
- <td>35.19</td>
528
- <td>43.67</td>
529
- <td>52.72</td>
530
- <td>00.60</td>
531
- <td>09.43</td>
532
- <td>03.45</td>
533
- <td>25.88</td>
534
- <td>07.25</td>
535
- <td>23.21</td>
536
- <td>01.25</td>
537
- <td>02.22</td>
538
- <td>00.04</td>
539
- <td>03.24</td>
540
- <td>02.82</td>
541
- </tr>
542
- <tr>
543
- <td><a href="https://huggingface.co/inceptionai/jais-family-13b-chat" target="_blank">jais-family-13b-chat</a></td>
544
- <td>45.11</td>
545
- <td>43.90</td>
546
- <td>58.67</td>
547
- <td>41.73</td>
548
- <td>00.92</td>
549
- <td>11.71</td>
550
- <td>04.01</td>
551
- <td>28.48</td>
552
- <td>05.70</td>
553
- <td>27.24</td>
554
- <td>04.50</td>
555
- <td>22.56</td>
556
- <td>00.03</td>
557
- <td>03.57</td>
558
- <td>01.77</td>
559
- </tr>
560
- <tr>
561
- <td><a href="https://huggingface.co/inceptionai/jais-adapted-13b-chat" target="_blank">jais-adapted-13b-chat</a></td>
562
- <td>45.20</td>
563
- <td>40.65</td>
564
- <td>49.67</td>
565
- <td>66.68</td>
566
- <td>00.87</td>
567
- <td>10.52</td>
568
- <td>04.02</td>
569
- <td>25.29</td>
570
- <td>06.66</td>
571
- <td>23.46</td>
572
- <td>20.14</td>
573
- <td>47.87</td>
574
- <td>0.04</td>
575
- <td>04.77</td>
576
- <td>01.92</td>
577
- </tr>
578
- <tr>
579
- <td><a href="https://huggingface.co/FreedomIntelligence/AceGPT-7B-chat" target="_blank">AceGPT-7b-chat</a></td>
580
- <td>35.98</td>
581
- <td>36.57</td>
582
- <td>30.11</td>
583
- <td>40.23</td>
584
- <td>00.44</td>
585
- <td>11.33</td>
586
- <td>01.05</td>
587
- <td>19.24</td>
588
- <td>06.92</td>
589
- <td>36.03</td>
590
- <td>11.05</td>
591
- <td>44.55</td>
592
- <td>00.06</td>
593
- <td>04.74</td>
594
- <td>02.28</td>
595
- </tr>
596
- <tr>
597
- <td><a href="https://huggingface.co/FreedomIntelligence/AceGPT-13B-chat" target="_blank">AceGPT-13b-chat</a></td>
598
- <td>41.09</td>
599
- <td>38.35</td>
600
- <td>33.11</td>
601
- <td>59.58</td>
602
- <td>00.98</td>
603
- <td>16.70</td>
604
- <td>00.81</td>
605
- <td>20.23</td>
606
- <td>08.73</td>
607
- <td>40.76</td>
608
- <td>14.02</td>
609
- <td>48.28</td>
610
- <td>00.12</td>
611
- <td>06.32</td>
612
- <td>02.80</td>
613
- </tr>
614
- <tr>
615
- <td><a href="https://huggingface.co/google/gemma-2-9b-it" target="_blank">gemma-2-9b-it</a></td>
616
- <td>35.91</td>
617
- <td>42.43</td>
618
- <td>31.00</td>
619
- <td>59.87</td>
620
- <td>03.10</td>
621
- <td>19.16</td>
622
- <td>01.72</td>
623
- <td>24.35</td>
624
- <td>05.18</td>
625
- <td>36.96</td>
626
- <td>08.23</td>
627
- <td>43.57</td>
628
- <td>00.17</td>
629
- <td>09.14</td>
630
- <td>13.81</td>
631
- </tr>
632
- <tr>
633
- <td><a href="meta-llama/Meta-Llama-3.1-8B-Instruct" target="_blank">Llama-3.1-8B-Instruct</a></td>
634
- <td>44.13</td>
635
- <td>38.24</td>
636
- <td>47.00</td>
637
- <td>44.08</td>
638
- <td>00.92</td>
639
- <td>14.19</td>
640
- <td>01.46</td>
641
- <td>23.82</td>
642
- <td>08.89</td>
643
- <td>33.08</td>
644
- <td>11.85</td>
645
- <td>35.51</td>
646
- <td>00.11</td>
647
- <td>06.02</td>
648
- <td>01.28</td>
649
- </tr>
650
- <tr>
651
- <td><strong><a href="https://huggingface.co/MBZUAI-Paris/Atlas-Chat-9B" target="_blank">Atlas-Chat-9B</a></strong></td>
652
- <td><b>58.23</td>
653
- <td><b>57.75</td>
654
- <td><b>74.56</td>
655
- <td><b>81.89</td>
656
- <td><b>28.08</td>
657
- <td><b>50.48</td>
658
- <td><b>18.16</td>
659
- <td><b>43.91</td>
660
- <td><b>18.63</td>
661
- <td><b>47.53</td>
662
- <td><b>29.98</td>
663
- <td><b>58.26</td>
664
- <td><b>22.08</td>
665
- <td><b>34.17</td>
666
- <td><b>59.76</td>
667
- </tr>
668
- <tr style="border-top: 4px solid;"></tr>
669
- <tr>
670
- <td><a href="https://huggingface.co/inceptionai/jais-family-30b-8k-chat" target="_blank">jais-family-30b-8k-chat</a></td>
671
- <td>51.88</td>
672
- <td>35.61</td>
673
- <td>65.67</td>
674
- <td>56.73</td>
675
- <td>01.10</td>
676
- <td>14.40</td>
677
- <td>01.67</td>
678
- <td>23.37</td>
679
- <td>08.52</td>
680
- <td>35.41</td>
681
- <td>13.71</td>
682
- <td>41.33</td>
683
- <td>00.05</td>
684
- <td>04.48</td>
685
- <td>00.46</td>
686
- </tr>
687
- <tr>
688
- <td><a href="https://huggingface.co/google/gemma-2-27b-it" target="_blank">gemma-2-27b-it</a></td>
689
- <td>36.47</td>
690
- <td>37.04</td>
691
- <td>35.78</td>
692
- <td>57.59</td>
693
- <td>00.67</td>
694
- <td>13.04</td>
695
- <td>01.74</td>
696
- <td>24.63</td>
697
- <td>05.17</td>
698
- <td>37.08</td>
699
- <td>07.36</td>
700
- <td>42.49</td>
701
- <td>00.03</td>
702
- <td>04.94</td>
703
- <td>11.10</td>
704
- </tr>
705
- <tr>
706
- <td><strong><a href="https://huggingface.co/MBZUAI-Paris/Atlas-Chat-27B" target="_blank">Atlas-Chat-27B</a></strong></td>
707
- <td><b>61.95</td>
708
- <td><b>48.37</td>
709
- <td><b>75.67</td>
710
- <td>73.00</td>
711
- <td><b>29.55</td>
712
- <td><b>51.74</td>
713
- <td><b>19.66</td>
714
- <td><b>45.65</td>
715
- <td><b>20.34</td>
716
- <td><b>49.19</td>
717
- <td><b>31.61</td>
718
- <td><b>59.37</td>
719
- <td><b>33.03</td>
720
- <td><b>40.95</td>
721
- <td><b>60.70</td>
722
- </tr>
723
-
724
-
725
-
726
- </table>
727
- -->
728
-
729
  ## Evaluation
730
  The Atlas-Chat models were evaluated on a comprehensive suite of tasks using various datasets and benchmarks to assess their performance across multiple dimensions. These included tasks such as:
731
 
@@ -752,14 +380,14 @@ The models were compared against a collection of existing open-source Arabic mod
752
  <tr>
753
  <td><a href="https://huggingface.co/inceptionai/jais-family-1p3b-chat" target="_blank">jais-family-1p3b-chat</a></td>
754
  <td>35.39</td>
755
- <td>32.51</td>
383
+ <td>27.71</td>
756
  <td>38.33</td>
757
  <td>35.56</td>
758
  </tr>
759
  <tr>
760
  <td><a href="https://huggingface.co/inceptionai/jais-family-2p7b-chat" target="_blank">jais-family-2p7b-chat</a></td>
761
  <td>37.44</td>
762
- <td>34.49</td>
390
+ <td>29.10</td>
763
  <td>44.11</td>
764
  <td>52.97</td>
765
  </tr>
@@ -787,7 +415,7 @@ The models were compared against a collection of existing open-source Arabic mod
787
  <tr>
788
  <td><strong><a href="https://huggingface.co/MBZUAI-Paris/Atlas-Chat-2B" target="_blank">Atlas-Chat-2B</a></strong></td>
789
  <td><b>44.97</b></td>
790
- <td><b>41.48</b></td>
418
+ <td><b>35.08</b></td>
791
  <td><b>53.89</b></td>
792
  <td><b>92.31</b></td>
793
  </tr>
@@ -795,35 +423,35 @@ The models were compared against a collection of existing open-source Arabic mod
795
  <tr>
796
  <td><a href="https://huggingface.co/inceptionai/jais-family-6p7b-chat" target="_blank">jais-family-6p7b-chat</a></td>
797
  <td>39.96</td>
798
- <td>41.57</td>
426
+ <td>32.64</td>
799
  <td>51.22</td>
800
  <td>65.18</td>
801
  </tr>
802
  <tr>
803
  <td><a href="https://huggingface.co/inceptionai/jais-adapted-7b-chat" target="_blank">jais-adapted-7b-chat</a></td>
804
  <td>39.30</td>
805
- <td>35.19</td>
433
+ <td>29.55</td>
806
  <td>43.67</td>
807
  <td>61.84</td>
808
  </tr>
809
  <tr>
810
  <td><a href="https://huggingface.co/inceptionai/jais-family-13b-chat" target="_blank">jais-family-13b-chat</a></td>
811
  <td>45.11</td>
812
- <td>43.90</td>
440
+ <td>33.98</td>
813
  <td>58.67</td>
814
  <td>69.93</td>
815
  </tr>
816
  <tr>
817
  <td><a href="https://huggingface.co/inceptionai/jais-adapted-13b-chat" target="_blank">jais-adapted-13b-chat</a></td>
818
  <td>45.20</td>
819
- <td>40.65</td>
447
+ <td>32.84</td>
820
  <td>49.67</td>
821
  <td>77.52</td>
822
  </tr>
823
  <tr>
824
  <td><a href="https://huggingface.co/FreedomIntelligence/AceGPT-7B-chat" target="_blank">AceGPT-7b-chat</a></td>
825
  <td>35.98</td>
826
- <td>36.57</td>
454
+ <td>30.33</td>
827
  <td>30.11</td>
828
  <td>47.31</td>
829
  </tr>
@@ -837,21 +465,21 @@ The models were compared against a collection of existing open-source Arabic mod
837
  <tr>
838
  <td><a href="https://huggingface.co/google/gemma-2-9b-it" target="_blank">gemma-2-9b-it</a></td>
839
  <td>35.91</td>
840
- <td>42.43</td>
468
+ <td>32.19</td>
841
  <td>31.00</td>
842
  <td>90.86</td>
843
  </tr>
844
  <tr>
845
  <td><a href="meta-llama/Meta-Llama-3.1-8B-Instruct" target="_blank">Llama-3.1-8B-Instruct</a></td>
846
  <td>44.13</td>
847
- <td>38.24</td>
475
+ <td>31.40</td>
848
  <td>47.00</td>
849
  <td>78.08</td>
850
  </tr>
851
  <tr>
852
  <td><strong><a href="https://huggingface.co/MBZUAI-Paris/Atlas-Chat-9B" target="_blank">Atlas-Chat-9B</a></strong></td>
853
  <td><b>58.23</b></td>
854
- <td><b>57.75</b></td>
482
+ <td><b>43.65</b></td>
855
  <td><b>74.56</b></td>
856
  <td><b>95.62</b></td>
857
  </tr>
 
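Both copies of the evaluation section report zero-shot normalized accuracy for DarijaMMLU, DarijaHellaSwag, Belebele Ary and Sentiment Analysis, obtained with the MBZUAI-Paris fork of the Language Model Evaluation Harness (https://github.com/MBZUAI-Paris/lm-evaluation-harness-atlas-chat). A zero-shot run through the harness's Python entry point would look roughly like the sketch below; the task name `darijammlu` is an assumption about that fork's task registry, and the model and batch settings are illustrative.

```python
# Hedged sketch of a zero-shot evaluation with the Language Model Evaluation Harness.
# The task name is an assumed identifier from the MBZUAI-Paris fork, not a verified one.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                                      # Hugging Face backend
    model_args="pretrained=MBZUAI-Paris/Atlas-Chat-9B,dtype=bfloat16",
    tasks=["darijammlu"],                                            # assumed task name for DarijaMMLU
    num_fewshot=0,                                                   # zero-shot, as stated in the README
    batch_size=8,                                                    # assumption
)

# acc_norm (normalized accuracy) is the metric the README reports for the MCQ benchmarks.
print(results["results"]["darijammlu"])
```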