SeaLLMs/SeaLLM-13B-Chat


nxphi47 committed on Oct 26, 2023

Commit 0e1abcd · 1 Parent(s): 33bee21

Update README.md

Files changed (1)

README.md +9 -6


@@ -96,12 +96,15 @@ We conduct SFT with a relatively balanced mix of SFT data from different categor
 One of the most reliable ways to compare chatbot models is peer comparison. With the help of native speakers, we built an instruction test set that focus on various aspects expected in a user-facing chatbot, namely (1) NLP tasks (e.g. translation & comprehension), (2) Reasoning, (3) Instruction-following and (4) Natural and Informal questions. The test set also covers all languages that we are concerned with.
 **Pending peer comparison**
-<!-- ! Add the stack chart better -->
-| vs ChatGPT | win | lose | tie
-| --- | --- | --- | --- |
-| Polylm-13b-chat           | 204 | 1517 | 122
-| Qwen-14b-chat             | 433 | 1128 | 306
-| SeaLLM-13bChat/SFT/v1     | 454 | 1185 | 209
 ### M3Exam - World Knowledge in Regional Languages

 One of the most reliable ways to compare chatbot models is peer comparison. With the help of native speakers, we built an instruction test set that focus on various aspects expected in a user-facing chatbot, namely (1) NLP tasks (e.g. translation & comprehension), (2) Reasoning, (3) Instruction-following and (4) Natural and Informal questions. The test set also covers all languages that we are concerned with.
 **Pending peer comparison**
+<img src="seallm_vs_chatgpt_by_lang.png" width="800" />
+<img src="seallm_vs_chatgpt_by_cat_sea.png" width="800" />
 ### M3Exam - World Knowledge in Regional Languages