QCRI
/

shamz15531 committed on
Commit
604f6e3
·
verified ·
1 Parent(s): 103f2d0

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -102,7 +102,7 @@ Evaluation was conducted using a modified version of the LM Evaluation Harness a
  | Model | MMLU (5-shot) | MMMLU (Arabic) (0-shot) | ArabicMMLU (3-shot) | HellaSwag (0-shot) | PIQA (0-shot) | ARC Challenge (0-shot) | Belebele (Arabic) (3-shot) | ACVA (5-shot) | GSM8k | OALL (0-shot) | OALL v2 (0-shot) | Almieyar Arabic (3-shot) | Arab Cultural MCQ (3-shot) | AraDiCE PIQA (MSA) (0-shot) | AraDiCE PIQA(Egy) (0-shot) | AraDiCE PIQA(Lev) (0-shot) | AraDiCE ArabicMMLU(Egy) (0-shot) | AraDiCE ArabicMMLU(Lev) (0-shot) |
  |-------|----------------|--------------------------|----------------------|--------------------|---------------|-------------------------|------------------------------|---------------|--------|----------------|------------------|---------------------------|-----------------------------|-------------------------------|------------------------------|------------------------------|-----------------------------------|-----------------------------------|
  | Fanar-1-9B | 71.33% | **57.38%** | **67.42%** | **80.76%** | 81.66% | 59.73% | **79.31%** | **81.31%** | **45.79%** | **54.94%** | **63.20%** | **77.18%** | **72.30%** | **66.00%** | **62.19%** | 57.67% | **55.79%** | **55.63%** |
- | AceGPT-v2-8B | 63.55% | 41.71% | 58.55% | 76.97% | 80.03% | 49.40% | 60.61% | 78.36% | 10.92% | 43.58% | - | 66.83% | 67.50% | 63.17% | 61.48% | 56.75% | 43.40% | 40.96% |
+ | AceGPT-v2-8B | 63.55% | 41.71% | 58.55% | 76.97% | 80.03% | 49.40% | 60.61% | 78.36% | 10.92% | 43.58% | 47.00% | 66.83% | 67.50% | 63.17% | 61.48% | 56.75% | 43.40% | 40.96% |
  | gemma-2-9b | 70.60% | 54.04% | 64.32% | 79.82% | **82.97%** | **65.53%** | 75.31% | 79.66% | 21.61% | 50.24% | 57.23% | 73.82% | 68.60% | 63.98% | 60.17% | 58.05% | 49.61% | 47.15% |
  | jais-adapted-13b | 50.42% | 34.01% | 51.96% | 78.02% | 78.94% | 48.55% | 43.02% | 73.52% | 5.76% | 40.79% | 40.06% | 62.34% | 60.90% | 65.02% | **62.19%** | **59.25%** | 38.24% | 37.93% |
  | jais-family-6p7b | 32.50% | 25.34% | 34.81% | 69.28% | 75.95% | 40.27% | 34.54% | 60.13% | 3.87% | 37.55% | 33.59% | 32.17% | 34.00% | 65.18% | 60.23% | 58.38% | 28.50% | 29.46% |