Are some model parameters missing?

#4
by hulianxue - opened

When I use SentenceTransformer, I get this warning:
Some weights of BertModel were not initialized from the model checkpoint at stella-base-zh-v3-1792d and are newly initialized: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Looking at the model parameters, there is no bert.pooler. It seems BertModel uses a pooler internally, but pytorch_model.bin doesn't actually contain these two parameters.

How should this be fixed? Are parameters actually missing from pytorch_model.bin, or is changing the config enough?

Owner

Hi, neither training nor inference uses the pooler layer's weights, so they are not needed.

OK, so the model should be loaded like this:
model = SentenceTransformer("infgrad/stella-base-zh-v3-1792d", model_kwargs={"add_pooling_layer": False})
Then there is no warning, and the BertModel has no pooling layer.


Note: the model has 3 modules, transformer/pooling/dense.
The transformer module is actually BertModel from the Transformers package. Unless add_pooling_layer=False is specified, BertModel adds a linear pooler by default — regardless of what the second module (pooling) does.
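The add_pooling_layer behavior described above can be seen directly in Transformers, without downloading the stella checkpoint. A minimal sketch, assuming the transformers package is installed and using a tiny randomly initialized BertConfig purely for illustration:

```python
from transformers import BertConfig, BertModel

# Tiny random config just to illustrate; these are NOT the
# stella-base-zh-v3-1792d weights.
config = BertConfig(hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64)

# Default is add_pooling_layer=True, so a linear pooler is created
# (these are the bert.pooler.dense.* weights from the warning).
with_pooler = BertModel(config)

# With add_pooling_layer=False there is no pooler module at all,
# so nothing is left uninitialized and no warning is emitted.
without_pooler = BertModel(config, add_pooling_layer=False)

print(with_pooler.pooler is not None)  # pooler module present
print(without_pooler.pooler)           # None: no pooler weights
```

This is why passing model_kwargs={"add_pooling_layer": False} through SentenceTransformer silences the warning: the checkpoint never contained pooler weights, and with this flag the model never asks for them.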

