About loading ckpt on my local machine

#8
by m1ku2 - opened

Hi lmms-lab team,
I'm trying to use your lmms-lab/llama3-llava-next-8b model and have downloaded the checkpoint files to my local machine.
When attempting to load the model locally using the following standard Hugging Face approach:
from transformers import AutoProcessor, AutoModelForCausalLM

model_path = "path/to/my/local/lmms-lab/llama3-llava-next-8b" # My local path

processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=True) # Assuming processor loads fine

model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)

I encounter the following error:
ValueError: Unrecognized configuration class <class 'transformers.models.llava.configuration_llava.LlavaConfig'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of BartConfig, BertConfig, ..., ZambaConfig.

This error suggests that AutoModelForCausalLM is not the correct AutoClass for this model, likely due to its multimodal LLaVA architecture which uses LlavaConfig.
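As a quick sanity check (a small sketch on my side; model_path is the same local path as above), reading the checkpoint's config.json confirms which model_type and architecture it declares:

import json
from pathlib import Path

model_path = "path/to/my/local/lmms-lab/llama3-llava-next-8b" # My local path

# Read the checkpoint's config to see which model_type / architecture it declares
config = json.loads((Path(model_path) / "config.json").read_text())
print(config.get("model_type"))      # resolves to LlavaConfig, per the error above
print(config.get("architectures"))   # class name(s) the checkpoint was saved with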
Could you please provide the recommended Python code snippet or specific model class that should be used to load the lmms-lab/llama3-llava-next-8b model checkpoint correctly from a local directory?
For instance, should I be using a more specific class like LlavaForConditionalGeneration (or a class specific to LLaVA-NeXT, if different), or perhaps a different AutoClass such as AutoModelForVision2Seq? (When I tried LlavaForConditionalGeneration, I got the warning "Some weights of LlavaForConditionalGeneration were not initialized from the model checkpoint".)
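For completeness, my LlavaForConditionalGeneration attempt looked roughly like this (a sketch of what I ran, not something taken from your docs):

from transformers import AutoProcessor, LlavaForConditionalGeneration

model_path = "path/to/my/local/lmms-lab/llama3-llava-next-8b" # My local path

processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=True)

# Loads without the config error, but warns that some weights were newly initialized,
# which makes me suspect this class does not match the checkpoint's weight layout.
model = LlavaForConditionalGeneration.from_pretrained(model_path, trust_remote_code=True)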
Any guidance on the proper local loading procedure would be very helpful.
Thanks for making this model available!
Best regards!

