GPU Requirement
#1 by KhanhVan - opened
How much VRAM does this model require? Is it possible to run it with vLLM on a 24 GB L4 GPU?
The model weights are around 19 GB, so it should be possible to run on 24 GB, though you probably can't use a very long context.
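A minimal sketch of launching it with a capped context so the weights plus KV cache fit in 24 GB, assuming a recent vLLM install; the model ID and the specific limits are assumptions, so adjust them to the checkpoint you actually use:

```python
# Sketch: single 24 GB L4, context capped to leave room for the KV cache.
from vllm import LLM, SamplingParams

llm = LLM(
    model="google/gemma-3-12b-it",   # assumed model ID
    max_model_len=8192,              # cap context so KV cache fits beside the weights
    gpu_memory_utilization=0.90,     # fraction of the 24 GB vLLM may claim
)

out = llm.generate(
    ["How much VRAM does a 12B model need at bf16?"],
    SamplingParams(max_tokens=64),
)
print(out[0].outputs[0].text)
```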
Thanks! Can you give me advice on whether to use Gemma 3 27B quantized or Gemma 3 12B?
My uneducated guess is that quantized 27B is better than BF16 12B. Sadly, Google doesn't publish benchmarks comparing the two, so you will need to try them out to be sure.
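For a rough sense of why the quantized 27B is even in the running, here is back-of-envelope weight-memory arithmetic; it is only a sketch and ignores KV cache, activations, and quantization overhead:

```python
# Approximate weight memory: parameters * bits per parameter / 8, in GB.
def weight_gb(n_params: float, bits_per_param: float) -> float:
    return n_params * bits_per_param / 8 / 1e9

print(f"27B @ 4-bit : ~{weight_gb(27e9, 4):.1f} GB")   # ~13.5 GB
print(f"12B @ bf16  : ~{weight_gb(12e9, 16):.1f} GB")  # ~24.0 GB
```

So the quantized 27B weights can actually be smaller than the BF16 12B weights, which is why quality, not memory, is the open question.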
Can I load Gemma 3 12B on two L4 GPUs?
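vLLM can shard a model across GPUs with tensor parallelism, so two L4s (48 GB total) should comfortably hold the 12B weights. A minimal sketch, again assuming the gemma-3-12b-it checkpoint:

```python
# Sketch: shard the model across two L4 GPUs with tensor parallelism.
from vllm import LLM

llm = LLM(
    model="google/gemma-3-12b-it",  # assumed model ID
    tensor_parallel_size=2,         # split weights across both GPUs
)
```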