Update README.md
README.md (changed)
@@ -52,7 +52,7 @@ This is an update over the June 2024 instruction-tuned Phi-3 Mini release based

The table below highlights the multilingual capability of Phi-3.5 Mini on the multilingual MMLU, MEGA, and multilingual MMLU-Pro datasets. Overall, we observed that even with just 3.8B active parameters, the model is competitive on multilingual tasks in comparison to other models with far more active parameters.

-| Benchmark | Phi-3.5 Mini-Ins | Phi-3-Mini-
+| Benchmark | Phi-3.5 Mini-Ins | Phi-3.0-Mini-128k-Instruct (June2024) | Mistral-7B-Instruct-v0.3 | Mistral-Nemo-12B-Ins-2407 | Llama-3.1-8B-Ins | Gemma-2-9B-Ins | Gemini 1.5 Flash | GPT-4o-mini-2024-07-18 (Chat) |
|----------------------------|------------------|-----------------------|--------------------------|---------------------------|------------------|----------------|------------------|-------------------------------|
| Multilingual MMLU | 55.4 | 51.08 | 47.4 | 58.9 | 56.2 | 63.8 | 77.2 | 72.9 |
| Multilingual MMLU-Pro | 30.9 | 30.21 | 15.0 | 34.0 | 21.4 | 43.0 | 57.9 | 53.2 |
@@ -66,7 +66,7 @@ The table below highlights multilingual capability of the Phi-3.5 Mini on multil

The table below shows Multilingual MMLU scores in some of the supported languages. For more multilingual benchmarks and details, see [Appendix A](#appendix-a).

-| Benchmark | Phi-3.5 Mini-Ins | Phi-3-Mini-
+| Benchmark | Phi-3.5 Mini-Ins | Phi-3.0-Mini-128k-Instruct (June2024) | Mistral-7B-Instruct-v0.3 | Mistral-Nemo-12B-Ins-2407 | Llama-3.1-8B-Ins | Gemma-2-9B-Ins | Gemini 1.5 Flash | GPT-4o-mini-2024-07-18 (Chat) |
|-----------|------------------|-----------------------|--------------------------|---------------------------|------------------|----------------|------------------|-------------------------------|
| Arabic | 44.2 | 35.4 | 33.7 | 45.3 | 49.1 | 56.3 | 73.6 | 67.1 |
| Chinese | 52.6 | 46.9 | 45.9 | 58.2 | 54.4 | 62.7 | 66.7 | 70.8 |