Provided code for vllm not working with vllm-0.1.7
#1 by rpeinl - opened
Hi,
I'm very curious to compare GPTQ- and AWQ-based models, so I tried your Marcoroni AWQ model with the code you provide in the model card. Thanks for that.
However, it is not working in my JupyterLab cluster. I installed vLLM with pip. It's the latest release, 0.1.7 from last week, but it doesn't recognize the `quantization` parameter:
`TypeError: EngineArgs.__init__() got an unexpected keyword argument 'quantization'`
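The call that triggers it is essentially this (reproducing the model-card snippet from memory; the exact repo id may differ):

```python
from vllm import LLM, SamplingParams

# The quantization kwarg is what vLLM 0.1.7 rejects with the TypeError above.
llm = LLM(
    model="TheBloke/Marcoroni-13B-AWQ",  # assumed repo id, may differ
    quantization="awq",
)

sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=256)
outputs = llm.generate(["Tell me about AI"], sampling_params)
for out in outputs:
    print(out.outputs[0].text)
```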
Do I need a nightly version of vllm?
Regards
René
Yeah, the `quantization` parameter was added to vLLM three days ago, in this commit: https://github.com/vllm-project/vllm/commit/fbe66e1d0b8d1445cb3204150afac74ab075e559
So yes, you'll need to build from GitHub source until they cut a fresh release - hopefully soon.
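Installing straight from the repo should pick that commit up; something along these lines (assuming a CUDA toolchain is available, since the kernels are compiled during install):

```shell
pip install git+https://github.com/vllm-project/vllm.git
```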