Provided code for vllm not working with vllm-0.1.7
#1 by rpeinl - opened
Hi,
I'm very curious to compare GPTQ- and AWQ-based models, so I tried your Marcoroni AWQ model with the code you provide in the model card. Thanks for that.
However, it is not working in my JupyterLab cluster. I installed vLLM with pip. It's the latest release, 0.1.7 from last week, but it doesn't recognize the `quantization` parameter:
`TypeError: EngineArgs.__init__() got an unexpected keyword argument 'quantization'`
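The call that triggers it is essentially this (reproducing the model-card snippet from memory; the exact repo id may differ):

```python
from vllm import LLM, SamplingParams

# The quantization kwarg is what vLLM 0.1.7 rejects with the TypeError above.
llm = LLM(
    model="TheBloke/Marcoroni-13B-AWQ",  # assumed repo id, may differ
    quantization="awq",
)

sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=256)
outputs = llm.generate(["Tell me about AI"], sampling_params)
for out in outputs:
    print(out.outputs[0].text)
```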
Do I need a nightly version of vllm?
Regards
René
Yeah, the `quantization` parameter was added to vLLM three days ago, in this commit: https://github.com/vllm-project/vllm/commit/fbe66e1d0b8d1445cb3204150afac74ab075e559
So yes, you'll need to build from GitHub source until they cut a fresh release - hopefully soon.
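Installing straight from the repo should pick that commit up; something along these lines (assuming a CUDA toolchain is available, since the kernels are compiled during install):

```shell
pip install git+https://github.com/vllm-project/vllm.git
```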