Having an issue using the model with vLLM

#1
by imadoualid - opened

Hello, I tried using the model with vLLM on a SageMaker instance (ml.g5.24, 4 GPUs × 24 GB = 96 GB).
The model outputs garbage:
from vllm import LLM

# `text` (the prompt) and `sampling_params` are defined elsewhere and not shown here.
llm = LLM(
    model="kaitchup/Qwen2.5-Coder-32B-Instruct-AutoRound-GPTQ-4bit",
    download_dir="user-default-efs/kaitchup/Qwen2.5-Coder-32B-Instruct-AutoRound-GPTQ-4bit",
    tensor_parallel_size=4,
    max_model_len=2048,
)

outputs = llm.generate([text], sampling_params)

for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(prompt)
    print(generated_text)
    print("="*60)
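
The snippet above does not show how `text` was built. Judging from the printed prompt that follows, it uses the ChatML-style layout with `<|im_start|>` and `<|im_end|>` markers. A minimal sketch of building such a string by hand is below; the helper name `build_chatml_prompt` is hypothetical and only for illustration. In practice the prompt was probably produced with the tokenizer's `apply_chat_template` from `transformers`, but that is an assumption.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML-style prompt ending with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


# Same system and user messages as in the printed prompt.
text = build_chatml_prompt(
    "You are Qwen, created by Alibaba Cloud. You are a helpful assistant.",
    "Tell me something about large language models.",
)
```

Printing `text` should reproduce the prompt layout shown in the output, so the prompt format itself does not look like the cause of the garbled generation.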

<|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
Tell me something about large language models.<|im_end|>
<|im_start|>assistant

/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk mktime mktime/Dk/Dk mktime mktime mktime mktime mktime mktime mktime mktime mktime mktime mktime/Dk/Dk/Dk mktime mktimeitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitant/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk
============================================================

Have you had a similar issue?
