Yinka

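The Yinka embedding model was further trained from the open-source model stella-v3.5-mrl, using the multi-task hybrid loss training described for piccolo2. Like stella-v3.5-mrl, this model also supports variable embedding dimensions.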

Usage

This model is used the same way as stella-v3.5-mrl; no prefix of any kind is needed.

from sentence_transformers import SentenceTransformer
from sklearn.preprocessing import normalize

model = SentenceTransformer("Classical/Yinka")
# Do not normalize yet! Take the first n dimensions, then normalize.
vectors = model.encode(["text1", "text2"], normalize_embeddings=False)
print(vectors.shape)  # shape is [2,1792]
n_dims = 768
cut_vecs = normalize(vectors[:, :n_dims])
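
The truncate-then-normalize step can also be written with NumPy alone. The following is a minimal sketch, not part of the original model card: the helper name `truncate_and_normalize` is hypothetical, and random vectors stand in for the output of `model.encode(..., normalize_embeddings=False)` so the example runs without downloading the model.

```python
import numpy as np


def truncate_and_normalize(vectors: np.ndarray, n_dims: int) -> np.ndarray:
    """Keep the first n_dims columns, then L2-normalize each row.

    Hypothetical helper for illustration; it mirrors
    sklearn.preprocessing.normalize(vectors[:, :n_dims]).
    """
    if not 0 < n_dims <= vectors.shape[1]:
        raise ValueError("n_dims must be between 1 and the full dimension")
    cut = vectors[:, :n_dims]
    norms = np.linalg.norm(cut, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    return cut / np.clip(norms, 1e-12, None)


# Assumption: random data standing in for two un-normalized 1792-d embeddings.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(2, 1792)).astype(np.float32)

cut_vecs = truncate_and_normalize(vectors, 768)

# After normalization, the dot product of two rows is their cosine similarity.
similarity = float(cut_vecs[0] @ cut_vecs[1])
print(cut_vecs.shape, similarity)
```

Normalizing only after truncation matters: slicing an already unit-length vector leaves a shorter vector whose norm is below 1, so dot products would no longer be cosine similarities.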

Results

Model Name	Model Size (GB)	Dimension	Sequence Length	Classification (9)	Clustering (4)	Pair Classification (2)	Reranking (4)	Retrieval (8)	STS (8)	Average (35)
Yinka	1.21	1792	512	74.30	61.99	89.87	69.77	74.40	63.30	70.79
stella-v3.5-mrl	1.21	1792	512	71.56	54.39	88.09	68.45	73.51	62.48	68.56
piccolo-large-zh-v2	1.21	1792	512	74.59	62.17	90.24	70.00	74.36	63.50	70.95