Issue using the model with vLLM
#1, opened by imadoualid
Hello, I tried running this model with vLLM on a SageMaker instance (ml.g5.24, 96 GB of GPU memory, 4 × 24 GB).
The model outputs garbage:
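from vllm import LLM

# Load the GPTQ 4-bit checkpoint, sharded across the 4 GPUs.
llm = LLM(
    model="kaitchup/Qwen2.5-Coder-32B-Instruct-AutoRound-GPTQ-4bit",
    download_dir="user-default-efs/kaitchup/Qwen2.5-Coder-32B-Instruct-AutoRound-GPTQ-4bit",
    tensor_parallel_size=4,
    max_model_len=2048,
)
# `text` (the chat-formatted prompt printed below) and `sampling_params`
# are defined earlier and not shown here.
outputs = llm.generate([text], sampling_params)
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(prompt)
    print(generated_text)
    print("=" * 60)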
<|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
Tell me something about large language models.<|im_end|>
<|im_start|>assistant
/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk mktime mktime/Dk/Dk mktime mktime mktime mktime mktime mktime mktime mktime mktime mktime mktime/Dk/Dk/Dk mktime mktimeitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitantitant/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk/Dk
============================================================
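For reference, the prompt printed above follows the ChatML layout. Below is a minimal sketch of how a string like `text` could be assembled by hand. The helper name `build_chatml_prompt` is hypothetical and is not taken from my actual script; in practice the tokenizer's chat template would usually produce this string.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML-style prompt that ends with an open assistant turn.

    Hypothetical helper for illustration; the layout mirrors the prompt
    printed in this post.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


text = build_chatml_prompt(
    "You are Qwen, created by Alibaba Cloud. You are a helpful assistant.",
    "Tell me something about large language models.",
)
print(text)
```

If the prompt built this way matches the one printed above, the prompt format is probably not what causes the garbled output.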
Have you had a similar issue?