Update README.md
README.md
CHANGED
@@ -42,4 +42,5 @@ All evaluations are conducted in a zero-shot setting.
 | **[CRUX](https://github.com/yuchenlin/ZeroEval)** | **49.25** | 46.00 |
 | **[MT-Bench](https://huggingface.co/spaces/lmsys/mt-bench)** | **8.81** | 8.53 |
 | **[MT-Bench-TW](https://huggingface.co/datasets/MediaTek-Research/TCEval-v2)** | **8.36** | 7.80 |
-| **[Chatbot-Arena-Hard](https://github.com/lmarena/arena-hard-auto)** | **43.90** | 33.60 |
+| **[Chatbot-Arena-Hard](https://github.com/lmarena/arena-hard-auto)** | **43.90** | 33.60 |
+| **[AlignBench](https://github.com/THUDM/AlignBench)** | **7.25** | 6.88 |