cicdatopea committed c1a45e8 (verified) · 1 parent: 195b181

Update README.md

Files changed (1): README.md (+29, -0)
README.md CHANGED
@@ -142,6 +142,35 @@ text="Which number is bigger, 9.11 or 9.8?"
 
  # But
  ```

+ ### Evaluate the model
+
+ Install the evaluation harness first: `pip3 install lm-eval==0.4.5`
+
+ ```bash
+ auto-round --model "OPEA/Qwen2.5-7B-Instruct-int4-inc" --eval --eval_bs 16 --tasks leaderboard_ifeval,leaderboard_mmlu_pro,gsm8k,lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,arc_easy,arc_challenge,cmmlu,ceval-valid
+ ```
+
+ | Metric                                     |  BF16  |  INT4  |
+ | :----------------------------------------- | :----: | :----: |
+ | Avg                                        | 0.5594 | 0.5556 |
+ | leaderboard_mmlu_pro 5 shots               | 0.3157 | 0.2926 |
+ | leaderboard_ifeval inst_level_strict_acc   | 0.5132 | 0.5036 |
+ | leaderboard_ifeval prompt_level_strict_acc | 0.3678 | 0.3512 |
+ | mmlu                                       | 0.6134 | 0.6105 |
+ | cmmlu                                      | 0.6685 | 0.6471 |
+ | ceval-valid                                | 0.6664 | 0.6412 |
+ | gsm8k 5 shots                              | 0.7377 | 0.7779 |
+ | lambada_openai                             | 0.5911 | 0.5890 |
+ | hellaswag                                  | 0.5384 | 0.5314 |
+ | winogrande                                 | 0.6472 | 0.6504 |
+ | piqa                                       | 0.7492 | 0.7486 |
+ | truthfulqa_mc1                             | 0.3550 | 0.3501 |
+ | openbookqa                                 | 0.2940 | 0.2906 |
+ | boolq                                      | 0.7713 | 0.7700 |
+ | arc_easy                                   | 0.7226 | 0.7239 |
+ | arc_challenge                              | 0.3985 | 0.4113 |
+
+
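Per the table, the INT4 checkpoint tracks BF16 within about 0.004 in average score (0.5556 vs. 0.5594). Since `auto-round --eval` relies on EleutherAI's lm-evaluation-harness (hence the `lm-eval==0.4.5` pin above), the numbers can also be cross-checked by calling the harness directly. A minimal sketch, assuming the quantized checkpoint loads through transformers with auto-round installed; the task subset and batch size mirror the command above:

```bash
# Cross-check a few of the zero-shot rows in the table above by invoking
# lm-eval directly rather than through auto-round's wrapper.
lm_eval --model hf \
  --model_args pretrained=OPEA/Qwen2.5-7B-Instruct-int4-inc \
  --tasks lambada_openai,hellaswag,piqa,arc_easy,arc_challenge \
  --batch_size 16
```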
  ### Generate the model

  Here is the sample command to generate the model.
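For orientation, a typical auto-round quantization call has the shape sketched below. Treat it as a sketch only: the flags are auto-round's standard CLI options, but the specific values (source model, bit width, group size, output directory) are illustrative assumptions, not the exact command from this README.

```bash
# Illustrative sketch, not the exact command from this README: quantize the
# presumed base model to INT4 with auto-round's standard CLI options.
auto-round \
  --model Qwen/Qwen2.5-7B-Instruct \
  --bits 4 \
  --group_size 128 \
  --output_dir "./tmp_autoround"
```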