How to run inference with vLLM

#13
by myrulezzz - opened

I'm having issues running inference on Foundation-Sec-8B with vLLM. This is my setup:
--port 8000 --model fdtn-ai/Foundation-Sec-8B --enable-auto-tool-choice --tool-call-parser llama3_json --chat-template examples/tool_chat_template_llama3.1_json.jinja
These are the parameters I pass to vllm serve. Am I missing something?
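
In case it helps narrow things down, this is roughly how I try to hit the server once it is up: a minimal sketch using the openai Python client against vLLM's OpenAI-compatible endpoint. The localhost URL, dummy API key, and the lookup_cve tool are just placeholders for testing whether tool calls come back parsed.

```python
# Minimal sketch: query the vLLM OpenAI-compatible server started with the command above.
# Assumes the server is listening on localhost:8000 and the `openai` package is installed.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Hypothetical tool, only here to check that --enable-auto-tool-choice
# and the llama3_json parser actually produce structured tool calls.
tools = [{
    "type": "function",
    "function": {
        "name": "lookup_cve",
        "description": "Look up a CVE identifier and return a short summary.",
        "parameters": {
            "type": "object",
            "properties": {"cve_id": {"type": "string"}},
            "required": ["cve_id"],
        },
    },
}]

resp = client.chat.completions.create(
    model="fdtn-ai/Foundation-Sec-8B",
    messages=[{"role": "user", "content": "What is CVE-2021-44228?"}],
    tools=tools,
    tool_choice="auto",
)
print(resp.choices[0].message)
```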
