Update README.md
#### Evaluations on cross-lingual capabilities

There are important use cases where one wants to retrieve multiple documents on a topic, or answers to questions, that are
formulated in a different language than the query. This increases recall and information retrieval coverage. To test
cross-lingual capabilities we evaluated Pharia-1-Embedding-4608-control, GritLM and Nvidia-Embed-v2 on the MLQA-V1 datasets
(Facebook) for the German/English and English/Spanish language pairs. For German/French we used the CLSD-WMT19 dataset, which
provides a correct and an adversarial translation of each sentence in the corresponding pair language. To check quality over a
larger range of sample sizes, we computed accuracies for varying numbers of samples taken from the MLQA-V1 dataset; for the
CLSD-WMT19 evaluation we used the full set of 2900 available samples.

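The MLQA-V1 accuracies below shrink as the sample pool grows, which is consistent with a top-1 retrieval setup: each question is
embedded in one language, every candidate context in the other, and a sample counts as correct when its paired context scores
highest. A minimal sketch of that computation under this assumption; the helper names and the commented loader are illustrative
placeholders, not the evaluation code actually used:

```python
# Illustrative sketch (not the actual evaluation harness): top-1 cross-lingual
# retrieval accuracy, given one query embedding per question and one passage
# embedding per context, where query_emb[i] belongs to passage_emb[i].
import numpy as np

def top1_accuracy(query_emb: np.ndarray, passage_emb: np.ndarray) -> float:
    # Normalise rows so the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    p = passage_emb / np.linalg.norm(passage_emb, axis=1, keepdims=True)
    scores = q @ p.T                       # (n, n) similarity matrix
    predicted = scores.argmax(axis=1)      # best-scoring context per question
    return float((predicted == np.arange(len(q))).mean())

# Hypothetical usage for the German/English pairing at a given sample size:
# de_questions, en_contexts = load_mlqa_pairs("de", "en", n_samples=1000)
# print(top1_accuracy(embed(de_questions), embed(en_contexts)))
```
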
#### MLQA-V1 Ger/Eng cross-lingual accuracies for the considered models

|# of samples|Pharia4608|GritLM|Nvidia-Embed-v2|
|:---:|:---:|:---:|:---:|
|1000|86.0%|82.5%|77.0%|
|2000|79.5%|73.4%|69.4%|
|4000|65.3%|59.2%|56.0%|
|6000|54.3%|48.6%|45.6%|
|10000|38.6%|32.8%|32.8%|

#### MLQA-V1 Eng/Esp cross-lingual accuracies for the considered models

|# of samples|Pharia4608|GritLM|Nvidia-Embed-v2|
|:---:|:---:|:---:|:---:|
|1000|87.5%|82.0%|81.5%|
|2000|78.5%|73.9%|70.7%|
|4000|65.5%|59.3%|56.9%|
|6000|55.3%|49.2%|46.2%|
|10000|41.7%|35.5%|33.2%|

#### CLSD-WMT19 Ger/Fra (2900 samples) cross-lingual evaluation for the considered models

|Model Name|Accuracy|
|:---:|:---:|
|Pharia-1-Embedding-4608-control|95.1%|
|GritLM-7B|94.2%|
|Nvidia-Embed-v2|93.4%|

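For the CLSD-WMT19 setup described above, an example can be counted as correct when the source sentence is closer to its true
translation than to the adversarial one. A minimal sketch of that comparison, assuming an `embed` callable for the model under
test (a placeholder, not the harness actually used):

```python
# Illustrative sketch only: CLSD-style scoring, where an example counts as
# correct if the source sentence is more similar to the correct translation
# than to the adversarial one. `embed` maps a sentence to a 1-D numpy vector.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def clsd_accuracy(sources, correct, adversarial, embed) -> float:
    hits = 0
    for src, pos, neg in zip(sources, correct, adversarial):
        e_src, e_pos, e_neg = embed(src), embed(pos), embed(neg)
        hits += cosine(e_src, e_pos) > cosine(e_src, e_neg)
    return hits / len(sources)
```
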
## Training Details