Hi, I wonder if anyone has experienced the same issue: when I host the model with vLLM following the instructions from Hugging Face and then connect it to Cline (the VS Code extension), it generates nonstop for any question I ask.