Missing weights for example code
I installed cramming and am using the latest transformers. The example code runs, but many weights cannot be loaded. Is this normal?
Some weights of the model checkpoint at ./models/pbelcak_FastBERT-1x11-long were not used when initializing ScriptableLMForPreTraining: ['encoder.layers.7.ffn.linear_in.bias', 'encoder.layers.12.ffn.linear_out.weight', 'encoder.layers.13.ffn.linear_out.weight', 'encoder.layers.6.ffn.linear_out.weight', 'encoder.layers.3.ffn.linear_in.weight', 'encoder.layers.12.ffn.linear_in.weight', 'encoder.layers.14.ffn.linear_in.bias', 'encoder.layers.2.ffn.linear_out.weight', 'encoder.layers.9.ffn.linear_in.bias', 'encoder.layers.0.ffn.linear_in.bias', 'encoder.layers.4.ffn.linear_out.weight', 'encoder.layers.6.ffn.linear_in.weight', 'encoder.layers.4.ffn.linear_in.weight', 'encoder.layers.11.ffn.linear_in.weight', 'encoder.layers.14.ffn.linear_in.weight', 'encoder.layers.2.ffn.linear_in.bias', 'encoder.layers.5.ffn.linear_out.weight', 'encoder.layers.10.ffn.linear_in.bias', 'encoder.layers.3.ffn.linear_out.weight', 'encoder.layers.7.ffn.linear_in.weight', 'encoder.layers.8.ffn.linear_out.weight', 'encoder.layers.9.ffn.linear_out.weight', 'encoder.layers.15.ffn.linear_in.bias', 'encoder.layers.13.ffn.linear_in.weight', 'encoder.layers.0.ffn.linear_in.weight', 'encoder.layers.10.ffn.linear_out.weight', 'encoder.layers.5.ffn.linear_in.weight', 'encoder.layers.6.ffn.linear_in.bias', 'encoder.layers.4.ffn.linear_in.bias', 'encoder.layers.15.ffn.linear_out.weight', 'encoder.layers.10.ffn.linear_in.weight', 'encoder.layers.13.ffn.linear_in.bias', 'encoder.layers.5.ffn.linear_in.bias', 'encoder.layers.2.ffn.linear_in.weight', 'encoder.layers.11.ffn.linear_in.bias', 'encoder.layers.1.ffn.linear_in.bias', 'encoder.layers.12.ffn.linear_in.bias', 'encoder.layers.8.ffn.linear_in.bias', 'encoder.layers.8.ffn.linear_in.weight', 'encoder.layers.1.ffn.linear_in.weight', 'encoder.layers.1.ffn.linear_out.weight', 'encoder.layers.3.ffn.linear_in.bias', 'encoder.layers.9.ffn.linear_in.weight', 'encoder.layers.0.ffn.linear_out.weight', 'encoder.layers.14.ffn.linear_out.weight', 'encoder.layers.15.ffn.linear_in.weight', 'encoder.layers.11.ffn.linear_out.weight', 'encoder.layers.7.ffn.linear_out.weight']
- This IS expected if you are initializing ScriptableLMForPreTraining from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing ScriptableLMForPreTraining from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of ScriptableLMForPreTraining were not initialized from the model checkpoint at ./models/pbelcak_FastBERT-1x11-long and are newly initialized: ['encoder.layers.14.ffn.dense_in.weight', 'encoder.layers.15.ffn.dense_out.weight', 'encoder.layers.15.ffn.dense_in.weight', 'encoder.layers.13.ffn.dense_in.weight', 'encoder.layers.11.ffn.dense_in.weight', 'encoder.layers.5.ffn.dense_out.weight', 'encoder.layers.12.ffn.dense_out.weight', 'encoder.layers.9.ffn.dense_out.weight', 'encoder.layers.1.ffn.dense_out.weight', 'encoder.layers.5.ffn.dense_in.weight', 'encoder.layers.8.ffn.dense_out.weight', 'encoder.layers.0.ffn.dense_out.weight', 'encoder.layers.8.ffn.dense_in.weight', 'encoder.layers.6.ffn.dense_in.weight', 'encoder.layers.4.ffn.dense_in.weight', 'encoder.layers.10.ffn.dense_out.weight', 'encoder.layers.4.ffn.dense_out.weight', 'encoder.layers.2.ffn.dense_out.weight', 'encoder.layers.11.ffn.dense_out.weight', 'encoder.layers.14.ffn.dense_out.weight', 'encoder.layers.0.ffn.dense_in.weight', 'encoder.layers.3.ffn.dense_out.weight', 'encoder.layers.13.ffn.dense_out.weight', 'encoder.layers.3.ffn.dense_in.weight', 'encoder.layers.1.ffn.dense_in.weight', 'encoder.layers.6.ffn.dense_out.weight', 'encoder.layers.10.ffn.dense_in.weight', 'encoder.layers.12.ffn.dense_in.weight', 'encoder.layers.2.ffn.dense_in.weight', 'encoder.layers.9.ffn.dense_in.weight', 'encoder.layers.7.ffn.dense_out.weight', 'encoder.layers.7.ffn.dense_in.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Hello,
What your warnings are saying is that weights were found for the FFF module (training/cramming/architectures/fff.py), but the model is trying to load weights for the FFNComponent module (training/cramming/crammed_bert.py). I just tried running the README example on a fresh instance and could not reproduce your warnings.
You're most likely using cramming installed from the original cramming repository and not from the training directory of this project.
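A quick way to confirm which cramming is being imported (my own suggestion, not something from the README) is to print the package path and the per-layer FFN parameter names; linear_in/linear_out are the FFF names from your warning log, while dense_in/dense_out belong to the original FFNComponent:

import cramming
from transformers import AutoModelForMaskedLM

# The package path should point into this project's training/ directory,
# not into a site-packages copy of the original cramming repository.
print(cramming.__file__)

# With the correct package, layer 0's FFN parameters should be named
# encoder.layers.0.ffn.linear_in.* / linear_out.* (matching the checkpoint);
# dense_in/dense_out would indicate the original FFN architecture was built.
model = AutoModelForMaskedLM.from_pretrained("pbelcak/FastBERT-1x11-long")
print([name for name, _ in model.named_parameters() if "layers.0.ffn" in name])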
To recap, these are the steps:
- pip uninstall cramming to remove the previous version of cramming installed in your environment -- or just start with a fresh environment
- cd training
- pip install .
- Create minimal_example.py
- Paste:
import cramming  # needed so transformers' Auto classes can resolve the crammed-BERT architecture
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("pbelcak/FastBERT-1x11-long")
model = AutoModelForMaskedLM.from_pretrained("pbelcak/FastBERT-1x11-long")

# Run a single forward pass over a sample sentence.
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
- Run python minimal_example.py.
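If the reinstall worked, from_pretrained should emit no "were not used" or "newly initialized" warnings. As an extra sanity check (a sketch of mine, not from the README, since the exact output fields of ScriptableLMForPreTraining may differ), you can append this to minimal_example.py to inspect what the forward pass returns:

# Inspect the forward-pass result; field names may differ from standard
# transformers outputs, so print what is actually there.
print(type(output).__name__)
print(output.keys() if hasattr(output, "keys") else output)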
There is no training folder in your repo. Am I looking in the wrong place?
Hi batrlatom,
This is the folder in the repo:
Yes, you are right. I did install the original cramming. Sorry I missed it in the README, and thank you very much for the reply.