GPU Requirement
#1 by KhanhVan - opened
How much VRAM does this model require? Is it possible to run it with vLLM on a 24 GB L4 GPU?
The model weights are around 19 GB, so it should be possible to run on 24 GB, though you probably can't use a very long context.
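A minimal sketch of launching it with a capped context so the weights plus KV cache fit in 24 GB, assuming a recent vLLM install; the model ID and the specific limits are assumptions, so adjust them to the checkpoint you actually use:

```python
# Sketch: single 24 GB L4, context capped to leave room for the KV cache.
from vllm import LLM, SamplingParams

llm = LLM(
    model="google/gemma-3-12b-it",   # assumed model ID
    max_model_len=8192,              # cap context so KV cache fits beside the weights
    gpu_memory_utilization=0.90,     # fraction of the 24 GB vLLM may claim
)

out = llm.generate(
    ["How much VRAM does a 12B model need at bf16?"],
    SamplingParams(max_tokens=64),
)
print(out[0].outputs[0].text)
```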
Thanks! Can you give me advice on whether to use Gemma 3 27B quantized or Gemma 3 12B?
My uneducated guess is that quantized 27B is better than BF16 12B. Sadly, Google doesn't publish benchmarks comparing the two, so you will need to try them out to be sure.
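For a rough sense of why the quantized 27B is even in the running, here is back-of-envelope weight-memory arithmetic; it is only a sketch and ignores KV cache, activations, and quantization overhead:

```python
# Approximate weight memory: parameters * bits per parameter / 8, in GB.
def weight_gb(n_params: float, bits_per_param: float) -> float:
    return n_params * bits_per_param / 8 / 1e9

print(f"27B @ 4-bit : ~{weight_gb(27e9, 4):.1f} GB")   # ~13.5 GB
print(f"12B @ bf16  : ~{weight_gb(12e9, 16):.1f} GB")  # ~24.0 GB
```

So the quantized 27B weights can actually be smaller than the BF16 12B weights, which is why quality, not memory, is the open question.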
Can I load Gemma 3 12B on two L4 GPUs?
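vLLM can shard a model across GPUs with tensor parallelism, so two L4s (48 GB total) should comfortably hold the 12B weights. A minimal sketch, again assuming the gemma-3-12b-it checkpoint:

```python
# Sketch: shard the model across two L4 GPUs with tensor parallelism.
from vllm import LLM

llm = LLM(
    model="google/gemma-3-12b-it",  # assumed model ID
    tensor_parallel_size=2,         # split weights across both GPUs
)
```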