Running model "unsloth/DeepSeek-V3-0324-GGUF" with vLLM does not work
The model card claims that this model can be run with vLLM, but that does not appear to be the case. Following the vLLM documentation, attempting to run the model with the following command:
vllm serve /root/.cache/huggingface/hub/DeepSeek-V3-0324-UD-IQ2_XXS.gguf --tokenizer unsloth/DeepSeek-V3-0324-GGUF --tensor-parallel-size 8
produces the following error:
ValueError: GGUF model with architecture deepseek2 is not supported yet.
Below is the relevant portion of the traceback:
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/transformers/configuration_utils.py", line 685, in _get_config_dict
config_dict = load_gguf_checkpoint(resolved_config_file, return_tensors=False)["config"]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/transformers/modeling_gguf_pytorch_utils.py", line 401, in load_gguf_checkpoint
raise ValueError(f"GGUF model with architecture {architecture} is not supported yet.")
ValueError: GGUF model with architecture deepseek2 is not supported yet.
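The failure can be reproduced without vLLM at all, since the ValueError is raised inside the transformers GGUF loader that appears in the traceback. A minimal sketch, assuming the quant has already been downloaded to the same local path used in the serve command above:

from transformers.modeling_gguf_pytorch_utils import load_gguf_checkpoint

# Same helper and signature as in the traceback above; for a GGUF whose
# architecture is deepseek2, this raises:
#   ValueError: GGUF model with architecture deepseek2 is not supported yet.
gguf_path = "/root/.cache/huggingface/hub/DeepSeek-V3-0324-UD-IQ2_XXS.gguf"
config = load_gguf_checkpoint(gguf_path, return_tensors=False)["config"]
print(config)

This suggests the limitation lives in transformers' GGUF config parsing, which vLLM relies on when given a --tokenizer pointing at a GGUF repo.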
We didn't write that it works on vLLM. Unfortunately I don't think DeepSeek GGUFs work on vLLM yet :(
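In the meantime, these GGUF quants are made for llama.cpp, so you can serve them there instead. A rough sketch, assuming a local llama.cpp build with the llama-server binary and the same download path as above:

llama-server -m /root/.cache/huggingface/hub/DeepSeek-V3-0324-UD-IQ2_XXS.gguf --port 8080 -ngl 99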