MU-NLPC/whisper-small-audio-captioning · Bug in "Use in Transformers" code

Feb 27, 2024

edited Feb 27, 2024

Code currently says:

# Load model directly
from transformers import AutoProcessor, WhisperForAudioCaptioning

processor = AutoProcessor.from_pretrained("MU-NLPC/whisper-small-audio-captioning")
model = WhisperForAudioCaptioning.from_pretrained("MU-NLPC/whisper-small-audio-captioning")

However, on a vanilla transformers install this fails, because WhisperForAudioCaptioning isn't available there. Can you provide instructions on how to install the library that provides it?

EDIT:
I've figured it out through a fairly hacky process of installing from the GitHub repo, but either way it isn't an ideal flow for new users. I would love for getting started to be easier.

prompteus

NLP Centre, Faculty of Informatics, Masaryk University org Feb 27, 2024

Hi,
Can you point me to the code that says that? It must be a mistake, because WhisperForAudioCaptioning is our own class, not one from Hugging Face transformers.

Anyway, you can find the definition of the WhisperForAudioCaptioning class here on the Hugging Face Hub, in the project files: https://huggingface.co/MU-NLPC/whisper-small-audio-captioning/tree/main

You can also use our training and inference scripts directly from the GitHub repository: https://github.com/prompteus/audio-captioning
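
For anyone landing here later, below is a minimal sketch of one way to pick up the class from the project files on the Hub instead of importing it from transformers. It is not the maintainers' documented procedure. It assumes that huggingface_hub is installed, that one of the repository's .py files contains the class definition, and that this file can be imported on its own (no relative imports). The helper names load_class_from_file and load_captioning_model are made up for illustration.

```python
import importlib.util
import sys
from pathlib import Path

REPO_ID = "MU-NLPC/whisper-small-audio-captioning"
CLASS_NAME = "WhisperForAudioCaptioning"


def load_class_from_file(path, class_name):
    """Import the Python file at `path` as a module and return `class_name` from it."""
    path = Path(path)
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    # Register the module before executing it so that code relying on
    # sys.modules (for example dataclasses) can find it.
    sys.modules[spec.name] = module
    spec.loader.exec_module(module)
    return getattr(module, class_name)


def load_captioning_model(repo_id=REPO_ID, class_name=CLASS_NAME):
    """Hypothetical helper: find the file defining `class_name` in the Hub repo and load the model."""
    # Imported lazily so load_class_from_file works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download, list_repo_files

    for filename in list_repo_files(repo_id):
        if not filename.endswith(".py"):
            continue
        local_path = hf_hub_download(repo_id=repo_id, filename=filename)
        source = Path(local_path).read_text(encoding="utf-8")
        # Assumption: the class is defined at top level in one of the .py files.
        if f"class {class_name}" in source:
            model_class = load_class_from_file(local_path, class_name)
            return model_class.from_pretrained(repo_id)
    raise FileNotFoundError(f"No .py file in {repo_id} defines {class_name}")
```

Calling load_captioning_model() downloads the matching file and the weights on first use. If the file depends on other modules from the repository, cloning the GitHub repository above and importing the class from there is probably the more robust route.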

prompteus changed discussion status to closed Mar 13, 2024