Update README.md
#### Evaluations on cross-lingual capabilities

There are important use cases where one wants to retrieve multiple documents on a topic, or answers to questions, that are
formulated in a different language than the query. This increases recall and information retrieval coverage. To test
cross-lingual capabilities we evaluated Pharia-1-Embedding-4608-control, GritLM and Nvidia-Embed-v2 on the MLQA-V1 datasets
(Facebook) for the German/English and English/Spanish language pairs. For German/French we used the CLSD-WMT19 dataset, which
provides a correct and an adversarial translation of each sentence in the corresponding pair language. To check quality over a
larger range of sample sizes, we computed accuracies for varying numbers of samples taken from the MLQA-V1 dataset; for the
CLSD-WMT19 evaluation we used the full set of 2900 available samples.

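The MLQA-V1 accuracies below shrink as the sample pool grows, which is consistent with a top-1 retrieval setup: each question is
embedded in one language, every candidate context in the other, and a sample counts as correct when its paired context scores
highest. A minimal sketch of that computation under this assumption; the helper names and the commented loader are illustrative
placeholders, not the evaluation code actually used:

```python
# Illustrative sketch (not the actual evaluation harness): top-1 cross-lingual
# retrieval accuracy, given one query embedding per question and one passage
# embedding per context, where query_emb[i] belongs to passage_emb[i].
import numpy as np

def top1_accuracy(query_emb: np.ndarray, passage_emb: np.ndarray) -> float:
    # Normalise rows so the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    p = passage_emb / np.linalg.norm(passage_emb, axis=1, keepdims=True)
    scores = q @ p.T                       # (n, n) similarity matrix
    predicted = scores.argmax(axis=1)      # best-scoring context per question
    return float((predicted == np.arange(len(q))).mean())

# Hypothetical usage for the German/English pairing at a given sample size:
# de_questions, en_contexts = load_mlqa_pairs("de", "en", n_samples=1000)
# print(top1_accuracy(embed(de_questions), embed(en_contexts)))
```
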
#### MLQA-V1 Ger/Eng cross-lingual accuracies for the considered models

|# of samples|Pharia4608|GritLM|Nvidia-Embed-v2|
|:---:|:---:|:---:|:---:|
|1000|86.0%|82.5%|77.0%|
|2000|79.5%|73.4%|69.4%|
|4000|65.3%|59.2%|56.0%|
|6000|54.3%|48.6%|45.6%|
|10000|38.6%|32.8%|32.8%|

#### MLQA-V1 Eng/Esp cross-lingual accuracies for the considered models

|# of samples|Pharia4608|GritLM|Nvidia-Embed-v2|
|:---:|:---:|:---:|:---:|
|1000|87.5%|82.0%|81.5%|
|2000|78.5%|73.9%|70.7%|
|4000|65.5%|59.3%|56.9%|
|6000|55.3%|49.2%|46.2%|
|10000|41.7%|35.5%|33.2%|

#### CLSD-WMT19 Ger/Fra (2900 samples) cross-lingual evaluation for the considered models

|Model Name|Accuracy|
|:---:|:---:|
|Pharia-1-Embedding-4608-control|95.1%|
|GritLM-7B|94.2%|
|Nvidia-Embed-v2|93.4%|

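For the CLSD-WMT19 setup described above, an example can be counted as correct when the source sentence is closer to its true
translation than to the adversarial one. A minimal sketch of that comparison, assuming an `embed` callable for the model under
test (a placeholder, not the harness actually used):

```python
# Illustrative sketch only: CLSD-style scoring, where an example counts as
# correct if the source sentence is more similar to the correct translation
# than to the adversarial one. `embed` maps a sentence to a 1-D numpy vector.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def clsd_accuracy(sources, correct, adversarial, embed) -> float:
    hits = 0
    for src, pos, neg in zip(sources, correct, adversarial):
        e_src, e_pos, e_neg = embed(src), embed(pos), embed(neg)
        hits += cosine(e_src, e_pos) > cosine(e_src, e_neg)
    return hits / len(sources)
```
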
## Training Details