mgoin committed on
Commit 4eb2935 · verified · 1 Parent(s): 736010e

Update README.md

  1. README.md +3 -3
README.md CHANGED
@@ -18,13 +18,13 @@ Model evaluation results obtained via [lm-evaluation-harness](https://github.com
  | Benchmark | Meta-Llama-3-8B | Meta-Llama-3-8B-pruned_50.2of4 | Meta-Llama-3-8B-pruned_50.2of4-FP8<br>(this model) |
  |:----------------------------------------------:|:-----------:|:-----------------------------:|:-----------------------------:|
  | [ARC-c](https://arxiv.org/abs/1911.01547)<br> 25-shot | 59.47% | 57.76% | 58.02% |
- | [MMLU](https://arxiv.org/abs/2009.03300)<br> 5-shot | 65.29% | 60.44% | xxxxxx |
+ | [MMLU](https://arxiv.org/abs/2009.03300)<br> 5-shot | 65.29% | 60.44% | 60.71% |
  | [HellaSwag](https://arxiv.org/abs/1905.07830)<br> 10-shot | 82.14% | 79.97% | 79.61% |
  | [WinoGrande](https://arxiv.org/abs/1907.10641)<br> 5-shot | 77.27% | 77.19% | 76.32% |
  | [GSM8K](https://arxiv.org/abs/2110.14168)<br> 5-shot | 44.81% | 47.92% | 49.36% |
  | [TruthfulQA](https://arxiv.org/abs/2109.07958)<br> 0-shot | 43.96% | 41.02% | 40.82% |
- | **Average<br>Accuracy** | **62.16%** | **60.72%** | xxxxxx |
- | **Recovery** | **100%** | **97.68%** | xxxxxx |
+ | **Average<br>Accuracy** | **62.16%** | **60.72%** | **60.81%** |
+ | **Recovery** | **100%** | **97.68%** | **97.83%** |
 
 
  ## Help
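
Review note: the Average Accuracy and Recovery values filled in by this commit can be cross-checked against the per-benchmark rows. The sketch below is not part of the README or of lm-evaluation-harness. It assumes that Average Accuracy is the unweighted mean of the six benchmark scores and that Recovery is a model's average divided by the Meta-Llama-3-8B average. The README does not state either definition, but both are consistent with the numbers in the table. The names `SCORES`, `BASELINE`, `average_accuracy`, `recovery`, and `summarize` are hypothetical.

```python
# Per-benchmark accuracies (%) copied from the table in this diff.
# Order: ARC-c, MMLU, HellaSwag, WinoGrande, GSM8K, TruthfulQA.
SCORES = {
    "Meta-Llama-3-8B": [59.47, 65.29, 82.14, 77.27, 44.81, 43.96],
    "Meta-Llama-3-8B-pruned_50.2of4": [57.76, 60.44, 79.97, 77.19, 47.92, 41.02],
    "Meta-Llama-3-8B-pruned_50.2of4-FP8": [58.02, 60.71, 79.61, 76.32, 49.36, 40.82],
}
BASELINE = "Meta-Llama-3-8B"


def average_accuracy(scores):
    """Unweighted mean of the benchmark scores (assumed definition)."""
    return sum(scores) / len(scores)


def recovery(model, baseline=BASELINE):
    """Model average as a percentage of the baseline average.

    ASSUMPTION: the README does not define Recovery; this ratio is
    inferred from the published numbers.
    """
    return 100.0 * average_accuracy(SCORES[model]) / average_accuracy(SCORES[baseline])


def summarize():
    """Return {model: (average accuracy, recovery)}, rounded to 2 decimals."""
    return {
        name: (round(average_accuracy(vals), 2), round(recovery(name), 2))
        for name, vals in SCORES.items()
    }


if __name__ == "__main__":
    for name, (avg, rec) in summarize().items():
        print(f"{name}: average={avg:.2f}% recovery={rec:.2f}%")
```

Under these assumptions the rounded results agree with the table, including the 60.81% average and 97.83% recovery added for the FP8 model.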