peralp24 committed on
Commit
3dc62e1
verified
1 Parent(s): 2d3b9ae

Update README.md

  1. README.md +37 -1
README.md CHANGED
@@ -252,6 +252,42 @@ from [mteb/scripts/task_selection/europe_tasks.csv at main · embeddings-benchma
  - i.e. this gives 20-2=18 translation pair subsets between the 5 languages. -2 because Italian ↔ German doesn't exist.
  - this is done because otherwise there are 250 translation pair subsets which are not as relevant (e.g. they contain Vietnamese ↔ Portuguese)


  ## Bias, Risks, and Limitations
 
@@ -271,7 +307,7 @@ Use the code below to get started with the model.

  [More Information Needed]

- ## Training Details

  ### Training Data
 
 
  - i.e. this gives 20-2=18 translation pair subsets between the 5 languages. -2 because Italian ↔ German doesn't exist.
  - this is done because otherwise there are 250 translation pair subsets which are not as relevant (e.g. they contain Vietnamese ↔ Portuguese)
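The 20-2=18 count in the first bullet can be reproduced with a short sketch. It assumes that translation pairs are directional (so 5 languages give 5 × 4 = 20 ordered pairs) and that the missing Italian ↔ German subset accounts for both directions. Only `deu` and `ita` are taken from the text; the other three language codes are hypothetical placeholders, because the full list of 5 languages is not given here.

```python
from itertools import permutations

# "deu" and "ita" come from the bullet above; the remaining three codes are
# hypothetical placeholders, since the actual five languages are not listed.
languages = ["deu", "ita", "lang_c", "lang_d", "lang_e"]

# Translation pairs are assumed to be directional, so count ordered pairs.
all_pairs = list(permutations(languages, 2))

# Assumption: both directions of Italian <-> German are unavailable.
missing = {("ita", "deu"), ("deu", "ita")}
available = [pair for pair in all_pairs if pair not in missing]

print(len(all_pairs), len(available))
```

With these assumptions the two counts are 20 and 18, matching the arithmetic in the bullet.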
 
+ #### Europe by task
+
+ | Model Name | AmazonCounterfactualClassification | BUCC.v2 | DiaBlaBitextMining | MassiveScenarioClassification | NTREXBitextMining | STS17 | Average |
+ |-------------------------------------------------------|-------------------------------------:|----------:|---------------------:|--------------------------------:|--------------------:|---------:|----------:|
+ | luminous-base-symmetric | 0.710921 | 0.990569 | 0.85374 | 0.710148 | 0.971263 | 0.879475 | 0.852686 |
+ | Pharia-7b-2048-medi1-causal-weighted-adapter | 0.735118 | 0.984346 | 0.822481 | 0.749375 | 0.968538 | 0.852473 | 0.852055 |
+ | Pharia-1-Embedding-4608-control | 0.724946 | 0.991884 | 0.865101 | 0.755763 | 0.982374 | 0.876741 | 0.866135 |
+ | GritLM-7B | 0.766381 | 0.994298 | 0.864504 | 0.789334 | 0.984593 | 0.880716 | 0.879971 |
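The `Average` column in the task table appears to be the unweighted mean of the six task scores in each row. The sketch below checks that reading against the numbers copied from the table; the variable and function names are illustrative and not part of the README.

```python
from statistics import mean

# Per-task scores copied from the "Europe by task" table, in column order.
task_scores = {
    "luminous-base-symmetric": [0.710921, 0.990569, 0.85374, 0.710148, 0.971263, 0.879475],
    "Pharia-7b-2048-medi1-causal-weighted-adapter": [0.735118, 0.984346, 0.822481, 0.749375, 0.968538, 0.852473],
    "Pharia-1-Embedding-4608-control": [0.724946, 0.991884, 0.865101, 0.755763, 0.982374, 0.876741],
    "GritLM-7B": [0.766381, 0.994298, 0.864504, 0.789334, 0.984593, 0.880716],
}

# Values of the "Average" column, copied from the same table.
reported_average = {
    "luminous-base-symmetric": 0.852686,
    "Pharia-7b-2048-medi1-causal-weighted-adapter": 0.852055,
    "Pharia-1-Embedding-4608-control": 0.866135,
    "GritLM-7B": 0.879971,
}


def max_deviation(scores, reported):
    """Largest gap between the unweighted row mean and the reported average."""
    return max(abs(mean(scores[model]) - reported[model]) for model in scores)


# The table values are rounded to six decimals, so allow a small tolerance.
assert max_deviation(task_scores, reported_average) < 1e-6
```

The same check holds for the per-language table, where the average is taken over the six language columns.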
+
+ #### Europe by language
+
+ | Model Name | deu-Latn | eng-Latn | fra-Latn | por-Latn | ita-Latn | spa-Latn | Average |
+ |-------------------------------------------------------|-----------:|-----------:|-----------:|-----------:|-----------:|-----------:|----------:|
+ | luminous-base-symmetric | 0.913887 | 0.90055 | 0.929288 | 0.927929 | 0.932836 | 0.93469 | 0.923197 |
+ | Pharia-7b-2048-medi1-causal-weighted-adapter | 0.914817 | 0.876927 | 0.918247 | 0.938783 | 0.92802 | 0.934084 | 0.91848 |
+ | Pharia-1-Embedding-4608-control | 0.925309 | 0.902113 | 0.937961 | 0.953719 | 0.942352 | 0.945642 | 0.934516 |
+ | GritLM-7B | 0.934603 | 0.905669 | 0.942364 | 0.962042 | 0.949731 | 0.947428 | 0.940306 |
+
+ ## Training Details
+
+ ### Model architecture
+
+ |:-------:|:-------:|
+ |Number of layers|27|
+ |Number of attention heads|36|
+ |Head size|128|
+ |Number of Key-Value heads|4|
+ |Size hidden dimension|4608|
+ |MLP expansion factor|4|
+ |MLP type|Standard|
+ |Vocabulary size|128,000|
+ |Rotary base|1,000,000|
+ |Total parameter count|7,041,544,704|
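The total parameter count can be related to the other rows of the table with a rough sketch. The layout below is an assumption, not something the README states: every linear layer has a bias, each block has two LayerNorms with scale and shift plus one final LayerNorm, the "Standard" MLP is an up and a down projection, and the input embedding and output head are separate (untied) matrices without bias. Under those assumptions the arithmetic reproduces the reported total, which makes the layout plausible but does not confirm it.

```python
def count_parameters(layers=27, heads=36, head_size=128, kv_heads=4,
                     hidden=4608, mlp_factor=4, vocab=128_000):
    """Parameter count under the assumed layout described above."""
    q_dim = heads * head_size       # 36 * 128 = 4608, equal to the hidden size
    kv_dim = kv_heads * head_size   # 4 * 128 = 512 (grouped key-value heads)

    # Attention: query, key, value and output projections, each with a bias (assumed).
    attention = (
        (hidden * q_dim + q_dim)
        + 2 * (hidden * kv_dim + kv_dim)
        + (q_dim * hidden + hidden)
    )

    # Standard MLP: up and down projection with biases (assumed).
    inner = mlp_factor * hidden
    mlp = (hidden * inner + inner) + (inner * hidden + hidden)

    # LayerNorm with scale and shift (assumed); two per block, one final.
    norm = 2 * hidden
    per_layer = attention + mlp + 2 * norm

    # Untied input embedding and output head, no bias (assumed).
    embeddings = 2 * vocab * hidden

    return layers * per_layer + norm + embeddings


total = count_parameters()
assert total == 7_041_544_704  # matches "Total parameter count" in the table
print(f"{total:,}")
```

The rotary base does not appear in the calculation because rotary position embeddings add no learned parameters.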
+
+
+

  ## Bias, Risks, and Limitations
 
 
  [More Information Needed]

+

  ### Training Data