Seems deepseek-vl2-small and deepseek-vl2 cannot perform properly


The same code is used for all of the models:
import torch
from transformers import AutoModelForCausalLM
from deepseek_vl2.models import DeepseekVLV2Processor

vl_chat_processor = DeepseekVLV2Processor.from_pretrained(model_path)
tokenizer = vl_chat_processor.tokenizer
vl_gpt = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
vl_gpt = vl_gpt.to(torch.bfloat16).cuda().eval()
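
One quick check, in case it matters: confirm that trust_remote_code resolved to the VL2 class (the class name in the comment is what I expect, not something I verified):

# confirm remote code picked up the expected model implementation
print(type(vl_gpt))  # expecting something like DeepseekVLV2ForCausalLM
print(vl_gpt.dtype, vl_gpt.device)  # should report torch.bfloat16 on a cuda device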

conversation = [
    {
        "role": "<|User|>",
        # "<image>" marks where the processor inserts the image tokens
        "content": f"<image>\n{prompt}",
    },
    {"role": "<|Assistant|>", "content": ""},
]

pil_images = [image]  # "image" is a PIL.Image loaded earlier

prepare_inputs = vl_chat_processor(
    conversations=conversation,
    images=pil_images, 
    force_batchify=True,
    system_prompt=""
).to(vl_gpt.device)
print("prepare_inputs keys:", prepare_inputs.keys())

inputs_embeds = vl_gpt.prepare_inputs_embeds(**prepare_inputs)
print("inputs_embeds shape:", inputs_embeds.shape)
outputs = vl_gpt.language.generate(
    inputs_embeds=inputs_embeds,
    attention_mask=prepare_inputs.attention_mask,
    pad_token_id=tokenizer.eos_token_id,
    bos_token_id=tokenizer.bos_token_id,
    eos_token_id=tokenizer.eos_token_id,
    max_new_tokens=512,
    do_sample=False,
    use_cache=True
)

answer = tokenizer.decode(outputs[0].cpu().tolist(), skip_special_tokens=True)
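
If it helps with reproducing, this is the sanity check I run next to rule out a prompt-templating problem (assuming prepare_inputs exposes input_ids; the keys printed above should confirm that):

# the decoded prompt should contain the expanded image placeholder tokens
decoded_prompt = tokenizer.decode(
    prepare_inputs.input_ids[0].cpu().tolist(),
    skip_special_tokens=False,  # keep special tokens so the image tokens are visible
)
print(decoded_prompt[:500])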

Both deepseek-vl2-small and deepseek-vl2 fail to give a proper answer, yet this exact code works fine with the tiny model.
Is that because the example code has some problem, or is it outdated?
Thanks to anyone who can spot the problem.
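
For reference, my understanding of the example in the upstream README is below; the image path is just a placeholder, and load_pil_images comes from the deepseek_vl2 package, so please correct me if this is out of date:

from deepseek_vl2.utils.io import load_pil_images

conversation = [
    {
        "role": "<|User|>",
        "content": "<image>\nDescribe this image.",
        "images": ["./images/example.jpeg"],  # placeholder path
    },
    {"role": "<|Assistant|>", "content": ""},
]
pil_images = load_pil_images(conversation)  # loads the paths listed in the conversation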
