Is there any quantized Pixtral Large model that can be run with the vLLM library?

#22 opened by HamzaChekireb

I am attempting to run a quantized Pixtral Large model with the vLLM framework on 160 GB of VRAM. However, I have not been able to find a compatible quantized checkpoint: the ones currently available (https://huggingface.co/models?other=base_model:quantized:mistralai/Pixtral-Large-Instruct-2411) are unfortunately not supported by vLLM. Frameworks like vLLM and Ollama are crucial for handling concurrent users and enabling efficient distributed inference.
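
For reference, this is roughly the kind of launch configuration I am trying. The tensor-parallel size and the on-the-fly FP8 quantization setting are only what I would expect to need for 160 GB of VRAM, not a setup I have confirmed to work:

```python
from vllm import LLM
from vllm.sampling_params import SamplingParams

# Sketch of a Pixtral Large launch on 160 GB of VRAM (e.g. 2x80 GB GPUs).
# tensor_parallel_size and quantization="fp8" are assumptions on my part,
# used as a stand-in for a pre-quantized checkpoint that vLLM would accept.
llm = LLM(
    model="mistralai/Pixtral-Large-Instruct-2411",
    tokenizer_mode="mistral",   # Pixtral ships Mistral-format tokenizer/config/weights
    config_format="mistral",
    load_format="mistral",
    tensor_parallel_size=2,     # split the model across two GPUs
    quantization="fp8",         # assumed on-the-fly quantization, not verified
)

params = SamplingParams(max_tokens=256)
outputs = llm.chat(
    [{"role": "user", "content": "Describe the Pixtral architecture in one sentence."}],
    sampling_params=params,
)
print(outputs[0].outputs[0].text)
```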
