predicted label is 32

#5
by aryanfar2025 - opened

The predicted_label is tensor([32]), i.e. 32?!

I get that as well. I used ChatGPT to convert the code to the following, and the output is now in the range 0 to 6.

import librosa
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2ForSequenceClassification

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("r-f/wav2vec-english-speech-emotion-recognition")
model = Wav2Vec2ForSequenceClassification.from_pretrained("r-f/wav2vec-english-speech-emotion-recognition")

# Load the audio file, resampled to 16 kHz
audio_path = "OAF_youth_angry.wav"
audio, rate = librosa.load(audio_path, sr=16000)
inputs = feature_extractor(audio, sampling_rate=rate, return_tensors="pt", padding=True)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(inputs.input_values)
print(outputs)
logits = outputs.logits
print(logits)

But I don't know why the wrong model was put here. I'm not convinced the model gives correct output. If it doesn't, I plan to train it again.
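For anyone checking the same thing, here is a minimal sketch of how the printed logits could be turned into a label index and checked against a label map. It uses plain Python lists so it runs without downloading the model. The ID2LABEL mapping and the decode_prediction helper are hypothetical placeholders, not taken from the model; the real mapping should be read from model.config.id2label, and with real output the list would come from something like logits[0].tolist().

```python
# Hypothetical label map for illustration only; the real one is model.config.id2label.
ID2LABEL = {0: "angry", 1: "disgust", 2: "fear", 3: "happy",
            4: "neutral", 5: "sad", 6: "surprise"}


def decode_prediction(logits, id2label):
    """Return (index, label) for the largest logit.

    The label is None when the index has no entry in id2label.
    """
    if not logits:
        raise ValueError("logits must not be empty")
    # Argmax over a plain list of floats
    index = max(range(len(logits)), key=lambda i: logits[i])
    return index, id2label.get(index)


# Made-up logits with seven entries, one per assumed class
seven_way = [0.1, -1.2, 0.3, 2.5, 0.0, -0.4, 0.9]
print(decode_prediction(seven_way, ID2LABEL))  # (3, 'happy')

# Made-up logits with 40 entries, largest at position 32
wide = [0.0] * 40
wide[32] = 1.0
print(decode_prediction(wide, ID2LABEL))  # (32, None)
```

If the index comes back without a label, as in the second call, that would suggest the classification head has more outputs than the label set being assumed, which is worth checking before retraining.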
