zake7749 committed on
Commit d98f1a3 · verified · 1 Parent(s): 6f440ab

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -42,4 +42,5 @@ All evaluations are conducted in a zero-shot setting.
 | **[CRUX](https://github.com/yuchenlin/ZeroEval)** | **49.25** | 46.00 |
 | **[MT-Bench](https://huggingface.co/spaces/lmsys/mt-bench)** | **8.81** | 8.53 |
 | **[MT-Bench-TW](https://huggingface.co/datasets/MediaTek-Research/TCEval-v2)** | **8.36** | 7.80 |
-| **[Chatbot-Arena-Hard](https://github.com/lmarena/arena-hard-auto)** | **43.90** | 33.60 |
+| **[Chatbot-Arena-Hard](https://github.com/lmarena/arena-hard-auto)** | **43.90** | 33.60 |
+| **[AlignBench](https://github.com/THUDM/AlignBench)** | **7.25** | 6.88 |