Running model "unsloth/DeepSeek-V3-0324-GGUF" with vLLM does not work

#11
by puppadas - opened

The model card claims that this model can be run with vLLM. However, this does not appear to be the case. Attempting to run the model with the following command, per the vLLM documentation:

vllm serve /root/.cache/huggingface/hub/DeepSeek-V3-0324-UD-IQ2_XXS.gguf --tokenizer unsloth/DeepSeek-V3-0324-GGUF --tensor-parallel-size 8

produces the following error:

ValueError: GGUF model with architecture deepseek2 is not supported yet.

Below is the relevant part of the traceback:

File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/transformers/configuration_utils.py", line 685, in _get_config_dict
config_dict = load_gguf_checkpoint(resolved_config_file, return_tensors=False)["config"]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/transformers/modeling_gguf_pytorch_utils.py", line 401, in load_gguf_checkpoint
raise ValueError(f"GGUF model with architecture {architecture} is not supported yet.")
ValueError: GGUF model with architecture deepseek2 is not supported yet.
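For reference, the `deepseek2` string in the error is not chosen by vLLM or transformers; it is read from the `general.architecture` metadata key in the GGUF file's header, which the loader then checks against its list of supported architectures. A minimal sketch of reading that key yourself, assuming the standard little-endian GGUF layout and that `general.architecture` is the first metadata entry (llama.cpp writes it first; a robust reader would iterate all entries):

```python
import struct

GGUF_MAGIC = b"GGUF"
GGUF_TYPE_STRING = 8  # GGUF metadata value type for UTF-8 strings

def read_gguf_architecture(data: bytes) -> str:
    """Return the general.architecture value from a GGUF blob.

    Sketch only: assumes the key is the first metadata entry and that
    the file uses the little-endian header layout (magic, u32 version,
    u64 tensor count, u64 metadata kv count).
    """
    if data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    offset = 24  # end of fixed-size header
    # Each key is a GGUF string: u64 byte length followed by UTF-8 bytes.
    (key_len,) = struct.unpack_from("<Q", data, offset)
    offset += 8
    key = data[offset:offset + key_len].decode("utf-8")
    offset += key_len
    (value_type,) = struct.unpack_from("<I", data, offset)
    offset += 4
    if key != "general.architecture" or value_type != GGUF_TYPE_STRING:
        raise ValueError("first metadata entry is not general.architecture")
    (val_len,) = struct.unpack_from("<Q", data, offset)
    offset += 8
    return data[offset:offset + val_len].decode("utf-8")
```

Running this on one of the DeepSeek-V3 shards would return "deepseek2", which is exactly the value transformers rejects in load_gguf_checkpoint.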

We didn't write that it works on vLLM. Unfortunately, I don't think DeepSeek GGUFs work on vLLM yet :(
