Update README.md
README.md
CHANGED
@@ -71,9 +71,17 @@ print(generated_text)
 
 ## Evaluation
 
-We present the results that compares the performance of the our evolved LLMs compared to the source LLMs. To reproduce the results, please use [our Github repository](https://github.com/SakanaAI/evolving-merged-models).
-
-
+We present results on the [MGSM-JA](juletxara/mgsm) test set, comparing the performance of our evolved LLMs with that of the source LLMs. To reproduce the results, please use [our GitHub repository](https://github.com/SakanaAI/evolving-merged-models).
+
+| Id. | Model | Type | Params | MGSM-JA (acc ↑) |
+| :--: | :-- | :-- | --: | --: |
+| 1 | [Shisa Gamma 7B v1](https://huggingface.co/augmxnt/shisa-gamma-7b-v1) | JA general | 7B | 9.6 |
+| 2 | [WizardMath 7B V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1) | EN math | 7B | 18.4 |
+| 3 | [Abel 7B 002](https://huggingface.co/GAIR/Abel-7B-002) | EN math | 7B | 30.0 |
+| 4 | [Arithmo2 Mistral 7B](https://huggingface.co/upaya07/Arithmo2-Mistral-7B) | EN math | 7B | 24.0 |
+| 5 | [(Ours) EvoLLM-v1-JP-7B](https://huggingface.co/SakanaAI/EvoLLM-v1-JP-7B) | 1+2+3 | 7B | **52.0** |
+| 6 | [(Ours) EvoLLM-v1-JP-7B-A](https://huggingface.co/SakanaAI/EvoLLM-v1-JP-7B-A) | 1+3+4 | 7B | **52.4** |
+| 7 | [(Ours) EvoLLM-v1-JP-10B](https://huggingface.co/SakanaAI/EvoLLM-v1-JP-10B) | 1+5 | 10B | **55.6** |
 
 
 ## Citation