Cannot reproduce the reported results with scripts/eval_mteb.py

#35
by whucai - opened

It may be an issue with how I am running the script, so I would appreciate your guidance. This is the result I get for AmazonCounterfactualClassification:
{
  "dataset_revision": "e8379541af4e31359cca9fbcf4b00f2671dba205",
  "mteb_dataset_name": "AmazonCounterfactualClassification",
  "mteb_version": "1.2.0",
  "test": {
    "en": {
      "accuracy": 0.7456716417910447,
      "accuracy_stderr": 0.03959698960905179,
      "ap": 0.3734162803359212,
      "ap_stderr": 0.03324491038682893,
      "f1": 0.685045611469467,
      "f1_stderr": 0.034135974099816266,
      "main_score": 0.7456716417910447
    },
    "evaluation_time": 9.86
  }
}
Command used:
python scripts/eval_mteb.py --model gte-Qwen2-1.5B-instruct --task mteb --output_dir gte-Qwen2-1.5B-instruct --pooling last

This matches an open discussion here: https://github.com/embeddings-benchmark/mteb/issues/1600
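As a side note, one way to cross-check the number independently is to run the same task through the mteb library rather than the repository script. This is a minimal sketch assuming mteb 1.2.x and sentence-transformers are installed; loading the model as a plain SentenceTransformer does not add any task instruction prefix, which can itself cause a gap versus the leaderboard score.

```python
# Minimal cross-check via the mteb library (illustrative only, not scripts/eval_mteb.py).
from mteb import MTEB
from sentence_transformers import SentenceTransformer

# No instruction/prompt handling here; scores may differ from the official setup.
model = SentenceTransformer("Alibaba-NLP/gte-Qwen2-1.5B-instruct", trust_remote_code=True)

evaluation = MTEB(tasks=["AmazonCounterfactualClassification"], task_langs=["en"])
results = evaluation.run(model, output_folder="results/gte-Qwen2-1.5B-instruct")
print(results)
```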

Alibaba-NLP org

Sorry about that. We checked and found some errors in the previous script. It has now been updated, and we verified that the results match those on the leaderboard. Please try again with the latest script.
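For context, the --pooling last flag corresponds to last-token pooling over the final hidden states. The sketch below follows the pattern shown in the model card's Transformers usage example; it is illustrative rather than the exact logic of scripts/eval_mteb.py, and the instruction string is a placeholder.

```python
# Sketch of last-token pooling for gte-Qwen2-1.5B-instruct (assumptions noted above).
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer


def last_token_pool(last_hidden_states: torch.Tensor,
                    attention_mask: torch.Tensor) -> torch.Tensor:
    # With left padding, the last position already holds each sequence's final real token.
    left_padding = (attention_mask[:, -1].sum() == attention_mask.shape[0])
    if left_padding:
        return last_hidden_states[:, -1]
    # With right padding, pick each sequence's last non-padding position.
    sequence_lengths = attention_mask.sum(dim=1) - 1
    batch_size = last_hidden_states.shape[0]
    return last_hidden_states[
        torch.arange(batch_size, device=last_hidden_states.device), sequence_lengths
    ]


tokenizer = AutoTokenizer.from_pretrained(
    "Alibaba-NLP/gte-Qwen2-1.5B-instruct", trust_remote_code=True
)
model = AutoModel.from_pretrained(
    "Alibaba-NLP/gte-Qwen2-1.5B-instruct", trust_remote_code=True
)

# Placeholder instruction/query; the evaluation script builds its own task prompts.
texts = ["Instruct: Classify the given text.\nQuery: I wish this product had worked."]
batch = tokenizer(texts, padding=True, truncation=True, max_length=8192, return_tensors="pt")

with torch.no_grad():
    outputs = model(**batch)

embeddings = last_token_pool(outputs.last_hidden_state, batch["attention_mask"])
embeddings = F.normalize(embeddings, p=2, dim=1)
```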
