stijn-zyphra committed
Commit c684bad · verified · 1 Parent(s): 9977389

Update README.md

Files changed (1): README.md (+7 -7)
README.md CHANGED
@@ -54,14 +54,14 @@ print((tokenizer.decode(outputs[0])))
 
 Zamba2-2.7B-Instruct-v2 achieves comparable performance to models of similar size.
 
-| Model | Size | IFEval | BBH | GPQA | MATH_hard | MMLU_pro | MUSR | Aggregate |
+| Model | Size (B) | IFEval | BBH | GPQA | MATH (Hard) | MMLU Pro | MUSR | Aggregate |
 |:--------------------------|:-------:|:--------:|:-------:|:------:|:-----------:|:----------:|:-------:|:-----------:|
-| Zamba2-2.7B-Instruct-v2 | 2.66B | 71.92 | 22.42 | 6.13 | 6.47 | 24.40 | 14.97 | 24.38 |
-| Zamba2-2.7B-Instruct | 2.66B | 46.56 | 21.32 | 4.09 | 5.71 | 23.18 | 8.56 | 18.24 |
-| Granite-3.2-2B-Instruct | 2.53B | 63.03 | 26.87 | 6.09 | 13.32 | 27.80 | 3.74 | 23.48 |
-| Qwen-2.5-3B-Instruct | 3.09B | 65.02 | 30.98 | 2.03 | 34.73 | 32.59 | 7.28 | 28.77 |
-| Llama3.2-3B-Instruct | 3.21B | 73.87 | 29.31 | 4.06 | 17.12 | 32.01 | 1.74 | 26.22 |
-| Gemma-2-2b-it | 2.61B | 19.76 | 24.42 | 2.58 | 1.04 | 25.80 | 7.16 | 13.46 |
+| Zamba2-2.7B-Instruct-v2 | 2.66 | 71.92 | 22.42 | 6.13 | 6.47 | 24.40 | 14.97 | 24.38 |
+| Zamba2-2.7B-Instruct | 2.66 | 46.56 | 21.32 | 4.09 | 5.71 | 23.18 | 8.56 | 18.24 |
+| Granite-3.2-2B-Instruct | 2.53 | 63.03 | 26.87 | 6.09 | 13.32 | 27.80 | 3.74 | 23.48 |
+| Qwen-2.5-3B-Instruct | 3.09 | 65.02 | 30.98 | 2.03 | 34.73 | 32.59 | 7.28 | 28.77 |
+| Llama3.2-3B-Instruct | 3.21 | 73.87 | 29.31 | 4.06 | 17.12 | 32.01 | 1.74 | 26.22 |
+| Gemma-2-2b-it | 2.61 | 19.76 | 24.42 | 2.58 | 1.04 | 25.80 | 7.16 | 13.46 |
 
 Moreover, due to its unique hybrid SSM architecture, Zamba2-2.7B-Instruct-v2 achieves extremely low inference latency and rapid generation with a significantly smaller memory footprint than comparable transformer-based models.
 
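
Two notes on the hunk above. First, the Aggregate column is the unweighted mean of the six benchmark scores; for Zamba2-2.7B-Instruct-v2, (71.92 + 22.42 + 6.13 + 6.47 + 24.40 + 14.97) / 6 ≈ 24.38. Second, the hunk header shows the edited table sits just below the README's usage snippet, which ends in `print((tokenizer.decode(outputs[0])))`. Below is a minimal sketch of what such a snippet plausibly looks like, assuming the standard Hugging Face `transformers` chat API; the repo id `Zyphra/Zamba2-2.7B-instruct-v2`, the prompt, and the generation settings are assumptions rather than content taken from this diff — check the model card for the authoritative version.

```python
# Minimal generation sketch, assuming the standard transformers API.
# The repo id below is an assumption inferred from the model name;
# verify it against the actual model card before use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Zyphra/Zamba2-2.7B-instruct-v2"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    torch_dtype=torch.bfloat16,
)

# Build a chat-formatted prompt (example prompt, not from the README).
chat = [{"role": "user", "content": "Briefly explain what a state-space model is."}]
input_ids = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a short completion and decode it, ending with the same
# print call that appears in the hunk header above.
outputs = model.generate(input_ids=input_ids, max_new_tokens=150)
print((tokenizer.decode(outputs[0])))
```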