text="Which number is bigger, 9.11 or 9.8?"
# But
```
### Evaluate the model

Install lm-eval and run the evaluation through the auto-round CLI:

```bash
pip3 install lm-eval==0.4.5
auto-round --model "OPEA/Qwen2.5-7B-Instruct-int4-inc" --eval --eval_bs 16 --tasks leaderboard_ifeval,leaderboard_mmlu_pro,gsm8k,lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,arc_easy,arc_challenge,cmmlu,ceval-valid
```
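The same tasks can also be run against the lm-eval harness directly. This is a minimal sketch rather than the documented path, and it assumes lm-eval's `hf` backend can load this GPTQ/AutoRound-format checkpoint:

```bash
# Sketch only: assumes the hf backend can load the INT4 checkpoint.
lm_eval --model hf \
  --model_args pretrained=OPEA/Qwen2.5-7B-Instruct-int4-inc \
  --tasks gsm8k,piqa,arc_easy \
  --batch_size 16
```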

| Metric                                     |  BF16  |  INT4  |
| :----------------------------------------- | :----: | :----: |
| Avg                                        | 0.5594 | 0.5556 |
| leaderboard_mmlu_pro 5 shots               | 0.3157 | 0.2926 |
| leaderboard_ifeval inst_level_strict_acc   | 0.5132 | 0.5036 |
| leaderboard_ifeval prompt_level_strict_acc | 0.3678 | 0.3512 |
| mmlu                                       | 0.6134 | 0.6105 |
| cmmlu                                      | 0.6685 | 0.6471 |
| ceval-valid                                | 0.6664 | 0.6412 |
| gsm8k 5 shots                              | 0.7377 | 0.7779 |
| lambada_openai                             | 0.5911 | 0.5890 |
| hellaswag                                  | 0.5384 | 0.5314 |
| winogrande                                 | 0.6472 | 0.6504 |
| piqa                                       | 0.7492 | 0.7486 |
| truthfulqa_mc1                             | 0.3550 | 0.3501 |
| openbookqa                                 | 0.2940 | 0.2906 |
| boolq                                      | 0.7713 | 0.7700 |
| arc_easy                                   | 0.7226 | 0.7239 |
| arc_challenge                              | 0.3985 | 0.4113 |
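As a quick sanity check on the table, the Avg row is the unweighted mean of the sixteen per-task scores; a small Python sketch, with the score list copied from the INT4 column above:

```python
# Unweighted mean of the sixteen INT4 scores from the table above.
int4 = [0.2926, 0.5036, 0.3512, 0.6105, 0.6471, 0.6412, 0.7779, 0.5890,
        0.5314, 0.6504, 0.7486, 0.3501, 0.2906, 0.7700, 0.7239, 0.4113]
print(round(sum(int4) / len(int4), 4))  # 0.5556, matching the Avg row
```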

### Generate the model

Here is the sample command to generate the model.