Are some model parameters missing?

#4
by hulianxue - opened

When I use SentenceTransformer, I get this warning:
Some weights of BertModel were not initialized from the model checkpoint at stella-base-zh-v3-1792d and are newly initialized: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Looking at the model parameters, there is no bert.pooler. It seems BertModel uses a pooler internally, but pytorch_model.bin doesn't actually contain these two parameters.

How should this be fixed? Are parameters actually missing from pytorch_model.bin, or is changing the config enough?

Owner

Hi, neither training nor inference uses the pooler layer's weights, so they are not needed.

OK, so the model should be loaded like this:
model = SentenceTransformer("infgrad/stella-base-zh-v3-1792d", model_kwargs={"add_pooling_layer": False})
Then there is no warning, and the BertModel has no pooling layer.


Note: the model has 3 modules, transformer/pooling/dense.
The transformer module is actually BertModel from the Transformers package. Unless add_pooling_layer=False is specified, BertModel adds a linear pooler by default — regardless of what the second module (pooling) does.
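The add_pooling_layer behavior described above can be seen directly in Transformers, without downloading the stella checkpoint. A minimal sketch, assuming the transformers package is installed and using a tiny randomly initialized BertConfig purely for illustration:

```python
from transformers import BertConfig, BertModel

# Tiny random config just to illustrate; these are NOT the
# stella-base-zh-v3-1792d weights.
config = BertConfig(hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64)

# Default is add_pooling_layer=True, so a linear pooler is created
# (these are the bert.pooler.dense.* weights from the warning).
with_pooler = BertModel(config)

# With add_pooling_layer=False there is no pooler module at all,
# so nothing is left uninitialized and no warning is emitted.
without_pooler = BertModel(config, add_pooling_layer=False)

print(with_pooler.pooler is not None)  # pooler module present
print(without_pooler.pooler)           # None: no pooler weights
```

This is why passing model_kwargs={"add_pooling_layer": False} through SentenceTransformer silences the warning: the checkpoint never contained pooler weights, and with this flag the model never asks for them.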

