pankajmathur/orca_mini_v2_7b


Pankaj Mathur committed on Jul 9, 2023

Commit c101123 • 1 Parent(s): 50686f1

Update README.md

Files changed (1)

README.md +8 -22


@@ -20,30 +20,16 @@ Please note this model has *better code generation capabilities* compare to our
 I evaluated orca_mini_v2_7b on a wide range of tasks using [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) from EleutherAI.
-Here are the zero shot metrics results.
-|||||||
-|:------:|:-------------:|:---------:|:--------:|:-------:|:--------:|
-|**Task**|**num_fewshot**|**Version**|**Metric**|**Value**|**Stderr**|
-|*arc_easy*|0|0|acc|0.7386|0.0090|
-|*hellaswag*|0|0|acc_norm|0.7394|0.0044|
-|*truthfulqa_mc*|0|1|mc2|0.4399|0.0153|
-|*mmlu*|0|1|acc_norm|0.4108|0.0153|
-|*Total Zero Shot Average*|0|-|-|0.5821|0.011|
 Here are the results on metrics used by [HuggingFaceH4 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
-please note num_fewshots varies for each below task as used by HuggingFaceH4 Open LLM Leaderboard
-|||||||
-|:------:|:-------------:|:---------:|:--------:|:-------:|:--------:|
-|**Task**|**num_fewshot**|**Version**|**Metric**|**Value**|**Stderr**|
-|*arc_challenge*|25|0|acc_norm|0.5077|0.0146|
-|*hellaswag*|10|0|acc_norm|0.7617|0.0043|
-|*mmlu*|5|0|acc_norm|0.3955|0.035|
-|*truthfulqa_mc*|0|1|mc2|0.4399|0.0153|
-|*Total Average*|0|-|-|0.5262|0.0173|

 I evaluated orca_mini_v2_7b on a wide range of tasks using [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) from EleutherAI.
 Here are the results on metrics used by [HuggingFaceH4 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+||||||
+|:------:|:--------:|:-------:|:--------:|
+|**Task**|**Metric**|**Value**|**Stderr**|
+|*arc_challenge*|acc_norm|0.5077|0.0146|
+|*hellaswag*|acc_norm|0.7617|0.0043|
+|*mmlu*|acc_norm|0.3955|0.035|
+|*truthfulqa_mc*|mc2|0.4399|0.0153|
+|*Total Average*|-|0.5262|0.0173|
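
The *Total Average* row in the added table appears to be the plain arithmetic mean of the four task rows, for both the Value and Stderr columns. The sketch below checks that arithmetic using the numbers copied from the table. The names `results` and `total_average` are illustrative and are not part of the README or the evaluation harness, and averaging standard errors this way is only a reproduction of the reported figure, not a pooled standard error.

```python
from statistics import mean

# (value, stderr) pairs copied from the added table in this commit.
results = {
    "arc_challenge": (0.5077, 0.0146),
    "hellaswag": (0.7617, 0.0043),
    "mmlu": (0.3955, 0.035),
    "truthfulqa_mc": (0.4399, 0.0153),
}


def total_average(rows):
    """Return the simple mean of the values and of the stderrs, rounded to 4 places."""
    values = [value for value, _ in rows.values()]
    stderrs = [stderr for _, stderr in rows.values()]
    return round(mean(values), 4), round(mean(stderrs), 4)


avg_value, avg_stderr = total_average(results)
print(f"Total Average: value={avg_value}, stderr={avg_stderr}")
```

Under these assumptions the computed pair matches the 0.5262 and 0.0173 reported in the table.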