VRAM usage
#1 by SerialKicked - opened
I don't use Gemma 27B models often, but for some reason all the Gemma 3 27B GGUFs I try seem to consume a LOT more VRAM than even Qwen3 32B at the same context size and quantization. Is there something funky going on with Gemma 3 GGUF files? I tried both KoboldCpp and LM Studio; it made no difference.
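For what it's worth, here's the back-of-the-envelope KV-cache math I did. The layer/head counts are what I believe each model's config.json reports (please double-check them), and it assumes the backend allocates a full-length fp16 cache for every layer, i.e. it doesn't exploit Gemma 3's sliding-window layers:

```python
# Per-token KV-cache size: 2 (K and V) * layers * kv_heads * head_dim * bytes per element.
# Model shapes below are assumptions taken from my reading of each config.json.

def kv_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                       bytes_per_elem: int = 2) -> int:  # 2 bytes = fp16 cache
    return 2 * layers * kv_heads * head_dim * bytes_per_elem

gemma3_27b = kv_bytes_per_token(layers=62, kv_heads=16, head_dim=128)
qwen3_32b  = kv_bytes_per_token(layers=64, kv_heads=8,  head_dim=128)

ctx = 32768  # context length in tokens
print(f"Gemma3 27B KV cache @ {ctx} ctx: {gemma3_27b * ctx / 2**30:.1f} GiB")
print(f"Qwen3 32B  KV cache @ {ctx} ctx: {qwen3_32b * ctx / 2**30:.1f} GiB")
```

If those shapes are right, Gemma 3 27B needs roughly twice the KV-cache VRAM per token of Qwen3 32B, which would line up with what I'm seeing. Gemma 3 reportedly uses sliding-window attention on most layers, so the cache could in principle be much smaller, but whether the runtime actually takes advantage of that presumably depends on the llama.cpp version underneath.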