Error when loading model

#14
by lferrer - opened

Hello,

When using the code provided in the readme for inference, I get the following error:

Some weights of the model checkpoint at alefiury/wav2vec2-large-xlsr-53-gender-recognition-librispeech were not used when initializing Wav2Vec2ForSequenceClassification: ['wav2vec2.encoder.pos_conv_embed.conv.weight_g', 'wav2vec2.encoder.pos_conv_embed.conv.weight_v']

  • This IS expected if you are initializing Wav2Vec2ForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing Wav2Vec2ForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

Some weights of Wav2Vec2ForSequenceClassification were not initialized from the model checkpoint at alefiury/wav2vec2-large-xlsr-53-gender-recognition-librispeech and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

I am guessing this is why I am getting essentially random results. Is it a version problem?
Thank you in advance for any help,

Luciana

Hello Luciana,

Thank you for using this model and for reporting this issue.

That warning message is indeed unexpected. It means that some of the checkpoint's pre-trained weights could not be matched to the model structure being loaded, so the affected layers were randomly re-initialized instead. That would explain the random results you're observing: the model isn't running with all of its learned parameters.

I tested the model recently, and this warning did not appear. For reference, the versions with which the model works as expected for me are:

torch: 2.5.1+cu124
transformers: 4.48.3

Updating these libraries in your environment to versions similar to these might help resolve the issue.
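To compare environments, a quick stdlib-only sketch that prints the versions installed on your side (it makes no assumptions about which packages are present):

```python
from importlib.metadata import PackageNotFoundError, version

# Report the installed version of each relevant package, or note
# when the package is missing from the environment.
for pkg in ("torch", "transformers"):
    try:
        print(f"{pkg}: {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```

You can paste its output here if updating alone doesn't resolve the warning.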

I built a Google Colab to test out this model applied to the LibriSpeech dataset, and it seems to be working as intended there. You can use it as a reference to compare environments and potentially pinpoint any discrepancies in your setup.

Let me know if updating helps or if you have any more problems.

alefiury changed discussion status to closed
