cicdatopea committed c1a45e8 (verified) · 1 parent: 195b181

Update README.md

Files changed (1): README.md (+29, -0)
README.md CHANGED
@@ -142,6 +142,35 @@ text="Which number is bigger, 9.11 or 9.8?"
 
  # But
  ```

+ ### Evaluate the model
+
+ Install the evaluation harness first: `pip3 install lm-eval==0.4.5`
+
+ ```bash
+ auto-round --model "OPEA/Qwen2.5-7B-Instruct-int4-inc" --eval --eval_bs 16 --tasks leaderboard_ifeval,leaderboard_mmlu_pro,gsm8k,lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,arc_easy,arc_challenge,cmmlu,ceval-valid
+ ```
+
+ | Metric                                     |  BF16  |  INT4  |
+ | :----------------------------------------- | :----: | :----: |
+ | Avg                                        | 0.5594 | 0.5556 |
+ | leaderboard_mmlu_pro 5 shots               | 0.3157 | 0.2926 |
+ | leaderboard_ifeval inst_level_strict_acc   | 0.5132 | 0.5036 |
+ | leaderboard_ifeval prompt_level_strict_acc | 0.3678 | 0.3512 |
+ | mmlu                                       | 0.6134 | 0.6105 |
+ | cmmlu                                      | 0.6685 | 0.6471 |
+ | ceval-valid                                | 0.6664 | 0.6412 |
+ | gsm8k 5 shots                              | 0.7377 | 0.7779 |
+ | lambada_openai                             | 0.5911 | 0.5890 |
+ | hellaswag                                  | 0.5384 | 0.5314 |
+ | winogrande                                 | 0.6472 | 0.6504 |
+ | piqa                                       | 0.7492 | 0.7486 |
+ | truthfulqa_mc1                             | 0.3550 | 0.3501 |
+ | openbookqa                                 | 0.2940 | 0.2906 |
+ | boolq                                      | 0.7713 | 0.7700 |
+ | arc_easy                                   | 0.7226 | 0.7239 |
+ | arc_challenge                              | 0.3985 | 0.4113 |
+
+
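Per the table, the INT4 checkpoint tracks BF16 within about 0.004 in average score (0.5556 vs. 0.5594). Since `auto-round --eval` relies on EleutherAI's lm-evaluation-harness (hence the `lm-eval==0.4.5` pin above), the numbers can also be cross-checked by calling the harness directly. A minimal sketch, assuming the quantized checkpoint loads through transformers with auto-round installed; the task subset and batch size mirror the command above:

```bash
# Cross-check a few of the zero-shot rows in the table above by invoking
# lm-eval directly rather than through auto-round's wrapper.
lm_eval --model hf \
  --model_args pretrained=OPEA/Qwen2.5-7B-Instruct-int4-inc \
  --tasks lambada_openai,hellaswag,piqa,arc_easy,arc_challenge \
  --batch_size 16
```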
  ### Generate the model

  Here is the sample command to generate the model.
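For orientation, a typical auto-round quantization call has the shape sketched below. Treat it as a sketch only: the flags are auto-round's standard CLI options, but the specific values (source model, bit width, group size, output directory) are illustrative assumptions, not the exact command from this README.

```bash
# Illustrative sketch, not the exact command from this README: quantize the
# presumed base model to INT4 with auto-round's standard CLI options.
auto-round \
  --model Qwen/Qwen2.5-7B-Instruct \
  --bits 4 \
  --group_size 128 \
  --output_dir "./tmp_autoround"
```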