Commit 5c65774 by peralp24 (verified) · 1 Parent(s): 1076870

Update README.md

Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -201,6 +201,20 @@ and ultimately lead to embeddings that are more useful for your use-case.
  - In cases where the two texts to compare are different in nature (e.g. query and document) – also called “asymmetric” – we suggest first adding an instruction to query texts only. Again, try and ideally evaluate the model in this setting. Then, if your aim is to further boost performance, we suggest that you add instructions to document texts as well, where [X] and [Y] are flipped accordingly.
+ ## Evaluation
+
+ Pharia-1-Embedding-4608-control has not been optimized for [MTEB](https://github.com/embeddings-benchmark/mteb) (a generic benchmark),
+ and we expect it to underperform there, because we optimize for real-world usage and multilinguality instead.
+ Nonetheless, for comparability, we share results on a subset of tasks from the
+ English MTEB benchmark. The subset contains tasks from all task types (classification, summarization, etc.) of
+ the full benchmark and is therefore roughly representative of it.
+
+ #### MTEB – English
+ For this evaluation, we use task-specific instructions from [MEDI2](https://huggingface.co/datasets/GritLM/MEDI2).
+
+ |Model Name|ArguAna|AskUbuntuDupQuestions|BIOSSES|Banking77Classification|EmotionClassification|MedrxivClusteringS2S|NFCorpus|STS17|STSBenchmark|SciFact|SummEval|TwitterSemEval2015|Average|
+ |--|--|--|--|--|--|--|--|--|--|--|--|--|--|
+ |Pharia-1-Embedding-4608-control|51.09|61.71|84.56|86.37|51.77|34.29|37.82|89.56|87.08|69.70|30.95|70.97|**62.99**|
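
The README does not say how the Average column is computed. As a quick sanity check, here is a minimal sketch that assumes it is the unweighted mean of the twelve per-task scores, rounded to two decimals. The scores are copied from the table row above; the names `scores` and `macro_average` are illustrative and not part of the model repository.

```python
# Per-task scores copied from the table row for Pharia-1-Embedding-4608-control.
scores = {
    "ArguAna": 51.09,
    "AskUbuntuDupQuestions": 61.71,
    "BIOSSES": 84.56,
    "Banking77Classification": 86.37,
    "EmotionClassification": 51.77,
    "MedrxivClusteringS2S": 34.29,
    "NFCorpus": 37.82,
    "STS17": 89.56,
    "STSBenchmark": 87.08,
    "SciFact": 69.70,
    "SummEval": 30.95,
    "TwitterSemEval2015": 70.97,
}


def macro_average(task_scores: dict[str, float]) -> float:
    """Unweighted mean over tasks, rounded to two decimals.

    Assumption: this is how the Average column was derived; the README
    does not state the aggregation method.
    """
    return round(sum(task_scores.values()) / len(task_scores), 2)


average = macro_average(scores)
print(f"{average:.2f}")  # prints 62.99
```

Under that assumption the result matches the reported 62.99, which suggests each task contributes equally regardless of its task type.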