GPU Requirement

#1
by KhanhVan - opened

How much VRAM does this model require? Is it possible to serve it with vLLM on a 24GB L4 GPU?

The model weights are around 19 GB, so it should be possible to run on 24 GB, though you probably won't be able to use a very long context.
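For what it's worth, a minimal vLLM sketch of that setup might look like this; the model ID, context length, and memory fraction below are assumptions to adjust for this repo:

```python
# Hedged sketch: serving a ~19 GB checkpoint on a single 24 GB L4 with vLLM.
# The model ID and limits are placeholders, not values from this repo.
from vllm import LLM, SamplingParams

llm = LLM(
    model="google/gemma-3-27b-it",  # placeholder; use this repo's actual model ID
    max_model_len=4096,             # short context so the KV cache fits in the leftover VRAM
    gpu_memory_utilization=0.95,    # let vLLM claim nearly all of the 24 GB
)

out = llm.generate(
    ["Explain KV-cache memory in one sentence."],
    SamplingParams(max_tokens=128),
)
print(out[0].outputs[0].text)
```

Capping `max_model_len` is the main lever here: it bounds the KV cache so it fits in whatever VRAM remains after the ~19 GB of weights are loaded.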

Thanks! Can you give me advice on whether to use a quantized Gemma 3 27B or Gemma 3 12B?

My uneducated guess is that the quantized 27B is better than the BF16 12B. Sadly, Google doesn't publish any benchmarks comparing them, so you will need to try them out to be sure.
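If you want to check for yourself, a quick spot check along these lines might help. The quantized 27B repo name is hypothetical (substitute whichever quant you pick), and this compares outputs side by side rather than producing quality scores:

```python
# Hedged sketch: a quick side-by-side spot check with vLLM (not a benchmark).
# Run one model per process so both never sit in VRAM at once.
import sys
from vllm import LLM, SamplingParams

def spot_check(model_id: str) -> None:
    llm = LLM(model=model_id, max_model_len=4096)
    params = SamplingParams(max_tokens=256, temperature=0.0)  # greedy, for comparability
    out = llm.generate(
        ["Explain the tradeoff between model size and quantization error."],
        params,
    )
    print(model_id, "->", out[0].outputs[0].text)

if __name__ == "__main__":
    # e.g. python spot_check.py google/gemma-3-12b-it
    #      python spot_check.py some-org/gemma-3-27b-it-awq  (hypothetical quant repo)
    spot_check(sys.argv[1])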

Can I load Gemma 3 12B across two L4 GPUs?
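For reference, the usual way to do this with vLLM is tensor parallelism; a minimal sketch, assuming the official gemma-3-12b-it checkpoint:

```python
# Hedged sketch: sharding Gemma 3 12B across two L4s with vLLM tensor parallelism.
from vllm import LLM

llm = LLM(
    model="google/gemma-3-12b-it",  # assumed official checkpoint ID
    tensor_parallel_size=2,         # split weights and compute across both GPUs
    max_model_len=8192,             # the extra VRAM also leaves room for more context
)
```

With roughly 48 GB across the two cards, the BF16 12B weights should fit comfortably and leave headroom for the KV cache.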
