Update README.md
README.md (changed)
@@ -52,7 +52,7 @@ This is an update over the June 2024 instruction-tuned Phi-3 Mini release based

The table below highlights the multilingual capability of Phi-3.5 Mini on the multilingual MMLU, MEGA, and multilingual MMLU-Pro datasets. Overall, we observed that even with just 3.8B active parameters, the model is competitive on multilingual tasks in comparison to other models with far more active parameters.

-| Benchmark | Phi-3.5 Mini-Ins | Phi-3-Mini-
+| Benchmark | Phi-3.5 Mini-Ins | Phi-3.0-Mini-128k-Instruct (June2024) | Mistral-7B-Instruct-v0.3 | Mistral-Nemo-12B-Ins-2407 | Llama-3.1-8B-Ins | Gemma-2-9B-Ins | Gemini 1.5 Flash | GPT-4o-mini-2024-07-18 (Chat) |
|----------------------------|------------------|-----------------------|--------------------------|---------------------------|------------------|----------------|------------------|-------------------------------|
| Multilingual MMLU | 55.4 | 51.08 | 47.4 | 58.9 | 56.2 | 63.8 | 77.2 | 72.9 |
| Multilingual MMLU-Pro | 30.9 | 30.21 | 15.0 | 34.0 | 21.4 | 43.0 | 57.9 | 53.2 |
@@ -66,7 +66,7 @@ The table below highlights multilingual capability of the Phi-3.5 Mini on multil

The table below shows Multilingual MMLU scores in some of the supported languages. For more multilingual benchmarks and details, see [Appendix A](#appendix-a).

-| Benchmark | Phi-3.5 Mini-Ins | Phi-3-Mini-
+| Benchmark | Phi-3.5 Mini-Ins | Phi-3.0-Mini-128k-Instruct (June2024) | Mistral-7B-Instruct-v0.3 | Mistral-Nemo-12B-Ins-2407 | Llama-3.1-8B-Ins | Gemma-2-9B-Ins | Gemini 1.5 Flash | GPT-4o-mini-2024-07-18 (Chat) |
|-----------|------------------|-----------------------|--------------------------|---------------------------|------------------|----------------|------------------|-------------------------------|
| Arabic | 44.2 | 35.4 | 33.7 | 45.3 | 49.1 | 56.3 | 73.6 | 67.1 |
| Chinese | 52.6 | 46.9 | 45.9 | 58.2 | 54.4 | 62.7 | 66.7 | 70.8 |