Update README.md
README.md (CHANGED)
@@ -102,7 +102,6 @@ Evaluated by
 
 Introduction about the M3Exam
 
-
 <!-- | Qwen-7b-chat | 33.91 | 60.85 | 29.57 | 0.00 | 18.04
 | Qwen-13b-v3-pro | 75.30 | 89.27 | 56.68 | 49.46 | 39.35
 | Qwen-13b-v3-pro-SFT | 38.20 | 4.23 | 46.39 | 33.97 | 19.79
@@ -136,14 +135,14 @@ Introduction about the M3Exam
 | SeaLLM-13bChat/SFT/v2 | 62.23 | 41.00 | 47.23 | 35.10 | 30.77 -->
 
 
-### MMLU -
+### MMLU - Preserving English-based knowledge
 
-|
+| 13B Models | STEM | Humanities | Social | Others | Average
 |-----------| ------- | ------- | ------- | ------- | ------- |
-| Llama-2
-| Llama-2-
-| SeaLLM-13bChat/SFT/
-| SeaLLM-13bChat/SFT/
+| Llama-2 | 44.10 | 52.80 | 62.60 | 61.10 | 54.80
+| Llama-2-chat | 43.70 | 49.30 | 62.60 | 60.10 | 53.50
+| SeaLLM-13bChat/SFT/v2 | 43.67 | 52.09 | 62.69 | 61.20 | 54.70
+| SeaLLM-13bChat/SFT/v3 | 43.30 | 52.80 | 63.10 | 61.20 | 55.00
 
 
 ### NLP tasks
@@ -162,7 +161,7 @@ Read-Comprehension | En | Zh | Vi | Id | Th | ALL | SEA
 
 #### Translation
 
-
+Translation between SEA-En. Scores in chrF++
 
 Model | En-Zh | En-Vi | En-Id | En-Th | En->X | Zh-En | Vi-En | Id-En | Th-En | X->En
 |-------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
@@ -171,6 +170,13 @@ Model | En-Zh | En-Vi | En-Id | En-Th | En->X | Zh-En | Vi-En | Id-En | Th-En | X->En
 | SeaLLM-13b-chat-v1 | 22.77 | 58.96 | 64.78 | 42.38 | 55.37 | 53.20 | 60.29 | 65.03 | 57.24 | 60.85
 | SeaLLM-13b-chat-v2 | 22.75 | 58.78 | 65.90 | 42.60 | 55.76 | 53.34 | 60.80 | 65.44 | 57.05 | 61.10
 
+Translation between SEA-SEA
+
+Model | Vi-Id | Id-Vi | Vi-Th | Th-Vi | Id-Th | Th-Id
+|-------- | ---- | ---- | ---- | ---- | ---- | ---- |
+ChatGPT | 56.75 | 54.17 | 40.48 | 46.54 | 40.59 | 51.87
+SeaLLM-13b-base mixed SFT | 54.56 | 54.76 | 36.68 | 51.88 | 39.36 | 47.99
+SeaLLM-13b-Chat/SFT/v2 | 53.75 | 52.47 | 32.76 | 49.20 | 40.43 | 50.03
 
 #### Summarization
 
@@ -194,3 +200,4 @@ If you find our project useful, hope you can star our repo and cite our work as
   year = 2023,
 }
 ```
+
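The translation tables above report chrF++ scores. As a rough illustration only, corpus-level chrF++ can be computed with the sacrebleu library; this is a sketch under that assumption, not the repository's own evaluation script, and the example sentences are placeholders rather than SeaLLM evaluation data.

```python
# Sketch: corpus-level chrF++ with sacrebleu (assumed tooling, not the repo's
# own evaluation pipeline). Hypotheses/references below are placeholders.
from sacrebleu.metrics import CHRF


def corpus_chrf_plus_plus(hypotheses, references):
    """Score one system's outputs against one set of references with chrF++."""
    metric = CHRF(word_order=2)  # word_order=2 adds word bigrams, i.e. chrF++
    return metric.corpus_score(hypotheses, [references]).score


if __name__ == "__main__":
    hyps = ["xin chào thế giới"]  # system translations (placeholder)
    refs = ["xin chào thế giới"]  # reference translations (placeholder)
    print(f"chrF++ = {corpus_chrf_plus_plus(hyps, refs):.2f}")
```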