Bug in "Use in Transformers" code
Code currently says:
# Load model directly
from transformers import AutoProcessor, WhisperForAudioCaptioning
processor = AutoProcessor.from_pretrained("MU-NLPC/whisper-small-audio-captioning")
model = WhisperForAudioCaptioning.from_pretrained("MU-NLPC/whisper-small-audio-captioning")
However, on a vanilla transformers install this fails because WhisperForAudioCaptioning
isn't available. Can you provide instructions on how to install this new library?
EDIT:
I've figured it out through some pretty hacky steps involving installing from the GitHub repo, but either way, it isn't an ideal flow for new users. Would love for getting started to be easier.
Hi,
Can you point me to the code that says that? It must be a mistake, because WhisperForAudioCaptioning is our class, not one from Hugging Face transformers.
Anyway, you can find the definition of the WhisperForAudioCaptioning class in the project files on the Hugging Face Hub: https://huggingface.co/MU-NLPC/whisper-small-audio-captioning/tree/main
You can also use our training and inference scripts directly from the GitHub repository: https://github.com/prompteus/audio-captioning
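For anyone landing here with the same import error, below is a minimal sketch for checking which installed package actually exposes the class before calling from_pretrained. Note that find_class is a hypothetical helper written for this illustration, and the module name audiocap is an assumption about what the GitHub repository installs as; check the repository's README for the real package name.

```python
import importlib


def find_class(class_name, module_names):
    """Return (module_name, cls) for the first importable module exposing class_name.

    Hypothetical helper: it is not part of transformers or of the
    audio-captioning repository.
    """
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module is not installed; try the next candidate
        cls = getattr(module, class_name, None)
        if cls is not None:
            return module_name, cls
    raise ImportError(
        f"{class_name} not found in any of: {', '.join(module_names)}"
    )


def locate_captioning_class():
    # "audiocap" is an assumed module name for the authors' package.
    # A vanilla transformers install is not expected to expose the class,
    # so the lookup falls through to the second candidate.
    return find_class("WhisperForAudioCaptioning", ["transformers", "audiocap"])
```

If locate_captioning_class() raises ImportError, neither package provides the class in the current environment, which matches the failure described at the top of this thread.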