TBergmanis committed · Commit e26c640 · verified · 1 Parent(s): e3895ef

Update README.md


Adjusted results. Now takes into account that not all models have BOS tokens. Excludes loss computation on the first token for all models. Adds EuroLLM 22B Preview.
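The adjustment described in this commit message can be illustrated with a minimal sketch. It is not the evaluation code behind the reported numbers. It assumes that character-level perplexity is the exponential of the summed token negative log-likelihoods (in nats) divided by the number of characters covered, that a BOS token (when a model has one) carries no loss of its own, and that the characters of the skipped first token are left out of the denominator. The function name `char_level_perplexity` and its parameters are hypothetical.

```python
import math
from typing import Optional, Sequence


def char_level_perplexity(
    token_texts: Sequence[str],
    token_nlls: Sequence[float],
    bos_text: Optional[str] = None,
) -> float:
    """Per-character perplexity with the first text token left unscored.

    token_texts[i] is the decoded string of token i and token_nlls[i] is its
    negative log-likelihood in nats. If bos_text is given and the sequence
    starts with it, the BOS position is dropped first, so that models with
    and without a BOS token are treated the same way (an assumption about
    what the commit message describes).
    """
    if len(token_texts) != len(token_nlls):
        raise ValueError("token_texts and token_nlls must have equal length")

    texts = list(token_texts)
    nlls = list(token_nlls)

    # Drop the BOS position, if this model has one.
    if bos_text is not None and texts and texts[0] == bos_text:
        texts, nlls = texts[1:], nlls[1:]

    # Exclude the first text token from the loss for every model.
    texts, nlls = texts[1:], nlls[1:]

    # Assumption: only characters of scored tokens count in the denominator.
    n_chars = sum(len(t) for t in texts)
    if n_chars == 0:
        raise ValueError("no scored characters left after skipping the first token")

    return math.exp(sum(nlls) / n_chars)


# The same text scored by a tokenizer without and with a BOS token.
no_bos = char_level_perplexity(["He", "llo", " world"], [5.0, 1.2, 2.4])
with_bos = char_level_perplexity(
    ["<s>", "He", "llo", " world"], [0.0, 3.0, 1.2, 2.4], bos_text="<s>"
)
```

Under these assumptions both calls score the same two tokens, so the result does not depend on whether the tokenizer prepends a BOS token or on how well the first token happened to be predicted.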

Files changed (1)
  1. README.md +29 -30
README.md CHANGED
@@ -120,33 +120,32 @@ Character-level perplexity creates a standardised comparison by calculating how
  **What data did we use?**
  We use WMT24++ as it is a multilingual, language-parallel evaluation set that none of the models have seen during training. WMT24++ is a composite of texts from news, literature, speech, and social media; thus, it is suitable for foundational model benchmarking.
 
- | Language | TildeOpen-30B | Gemma-2-27B | EuroLLM-9B | ALIA-40B |
- |----------|---------------|-------------|------------|-----------------|
- | Bulgarian | **2.1716** | 2.3541 | 2.3502 | 2.2411 |
- | Croatian | **2.2259** | 2.6809 | 2.6780 | 2.3456 |
- | Czech | **2.2682** | 2.4873 | 2.4808 | 2.3639 |
- | Danish | **2.0968** | 2.2608 | 2.2586 | 2.1543 |
- | Dutch | **2.0136** | 2.1249 | 2.1185 | 2.0629 |
- | English | 2.1497 | **2.0342** | 2.1897 | 2.1027 |
- | Estonian | **2.2825** | 2.7163 | 2.5652 | 2.4232 |
- | Finnish | **2.1687** | 2.4069 | 2.3844 | 2.2774 |
- | French | 1.9779 | 2.0195 | 2.0479 | **1.9750** |
- | German | **1.9664** | 2.0214 | 2.0499 | 1.9725 |
- | Hungarian | **2.1481** | 2.3308 | 2.3705 | 2.2493 |
- | Icelandic | **2.2011** | 3.1917 | 5.3162 | 4.0978 |
- | Italian | **2.0431** | 2.1065 | 2.1213 | 2.0604 |
- | Latvian | **2.2477** | 2.6701 | 2.4896 | 2.4352 |
- | Lithuanian | **2.2301** | 2.5495 | 2.4754 | 2.4109 |
- | Norwegian | **2.2445** | 2.4173 | 2.5121 | 2.3152 |
- | Polish | **2.1214** | 2.2294 | 2.2264 | 2.1847 |
- | Portuguese | **2.0810** | 2.1554 | 2.1561 | 2.0884 |
- | Romanian | **2.1266** | 2.2724 | 2.2821 | 2.1974 |
- | Russian | **2.1502** | 2.2091 | 2.2813 | 2.1889 |
- | Serbian | **2.3708** | 2.8053 | 4.7160 | 2.5119 |
- | Slovak | **2.2281** | 2.4674 | 2.4588 | 2.3505 |
- | Slovenian | **2.2662** | 2.5798 | 2.5087 | 2.3611 |
- | Spanish | 2.0400 | 2.0665 | 2.1186 | **2.0055** |
- | Swedish | **2.1471** | 2.2971 | 2.2856 | 2.2039 |
- | Turkish | **2.2108** | 2.3665 | 2.3508 | 3.0611 |
- | Ukrainian | **2.2470** | 2.4000 | 2.4251 | 2.3168 |
-
+ | Language | TildeOpen 30b | Gemma 2 27b | EuroLLM 22B Prev. | ALIA 40B |
+ |-----------------|---------|------------|----|------|
+ | Bulgarian | **2.0539** | 2.2184 | 2.1985 | 2.1336 |
+ | Czech | **2.1579** | 2.3522 | 2.3221 | 2.2719 |
+ | Danish | **2.003** | 2.1517 | 2.1353 | 2.0805 |
+ | German | **1.8769** | 1.9285 | 1.9452 | 1.904 |
+ | English | 2.0378 | **1.9525** | 2.0568 | 2.0261 |
+ | Spanish | 1.9503 | 1.9752 | 2.0145 | **1.9369** |
+ | Estonian | **2.1711** | 2.5747 | 2.3852 | 2.325 |
+ | Finnish | **2.0497** | 2.288 | 2.2388 | 2.1831 |
+ | French | **1.8978** | 1.9355 | 1.9282 | 1.9084 |
+ | Croatian | **2.1147** | 2.544 | 2.4905 | 2.2433 |
+ | Hungarian | **2.0539** | 2.2228 | 2.2256 | 2.1635 |
+ | Icelandic | **2.0873** | 3.0329 | 4.7908 | 3.957 |
+ | Italian | **1.9565** | 2.0137 | 2.0098 | 1.9887 |
+ | Lithuanian | **2.1247** | 2.4175 | 2.3137 | 2.3075 |
+ | Latvian | **2.1439** | 2.5355 | 2.3141 | 2.3276 |
+ | Dutch | **1.9333** | 2.0312 | 2.0079 | 1.9904 |
+ | Norwegian | **2.1284** | 2.2862 | 2.3506 | 2.2253 |
+ | Polish | **2.0241** | 2.1294 | 2.0803 | 2.0803 |
+ | Portuguese | **1.9899** | 2.0597 | 2.0272 | 2.0187 |
+ | Romanian | **2.0196** | 2.1606 | 2.1641 | 2.1114 |
+ | Russian | **2.0424** | 2.09 | 2.1095 | 2.0871 |
+ | Slovak | **2.1192** | 2.338 | 2.3029 | 2.2609 |
+ | Slovenian | **2.1556** | 2.4443 | 2.3398 | 2.2589 |
+ | Serbian | **2.2469** | 2.6351 | 4.2471 | 2.3743 |
+ | Swedish | **2.041** | 2.1809 | 2.1464 | 2.1211 |
+ | Turkish | **2.0997** | 2.247 | 2.2202 | 2.232 |
+ | Ukrainian | **2.1376** | 2.2665 | 2.2691 | 2.2086 |