What training loss and eval loss should I expect when loading this adapter for continual training on this task?

by Xtracta-Qiming

Hi,

I am loading the adapter for continual training on the same ChartQA task, but the loss and eval_loss don't behave like continual training; they look much like training from scratch. Here is how I load the adapter. Thanks for any suggestions!

import torch
from transformers import Qwen2VLForConditionalGeneration

model_id = "Qwen/Qwen2-VL-7B-Instruct"
# adapter_path = "./qwen2-2b-instruct-trl-sft-ChartQA/checkpoint-264"
adapter_path = "sergiopaniego/qwen2-7b-instruct-trl-sft-ChartQA"

# Next, we'll load the model and the tokenizer to prepare for inference.
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
# Attach the fine-tuned LoRA adapter to the base model.
model.load_adapter(adapter_path)

[Screenshot attached: training loss and eval_loss curves from the continued run]

Hi @Xtracta-Qiming!

Did you manage to solve the issue?
You need to follow the same steps to load the model as in the recipe. Also, since the recipe only uses a subset of the dataset, training on a different subset could also be causing the problem.
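
One thing worth double-checking: PEFT loads adapters in inference mode (frozen) by default, and if you pass a fresh peft_config to the trainer, it will typically wrap the model in a brand-new, randomly initialized adapter instead of resuming yours. Here is a minimal sketch of loading the adapter in trainable mode, assuming the peft library is installed (this is my suggestion, not the recipe's exact code):

import torch
from peft import PeftModel
from transformers import Qwen2VLForConditionalGeneration

base = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
# is_trainable=True unfreezes the LoRA weights so continued training updates
# them; the default (False) loads the adapter frozen, for inference only.
model = PeftModel.from_pretrained(
    base,
    "sergiopaniego/qwen2-7b-instruct-trl-sft-ChartQA",
    is_trainable=True,
)
# Pass this model to the trainer WITHOUT a new peft_config, so it doesn't
# create a second, randomly initialized adapter on top.

If the existing adapter is actually the one being trained, the first logged training loss should start near where the original run ended rather than at the from-scratch value.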
