Finetuning CUDA OOM
#1 opened by Owos
I'm trying to finetune the model on an H100, but I keep getting a CUDA OOM error. How are you able to fit the model on a single GPU?
On a single H100 with only 80GB it is not possible to fully fine-tune a 12B model in BF16 precision: that requires ~24GB for the model weights, ~24GB for the gradients, and ~48GB for the optimizer states if you use Adam, plus additional memory for activations.
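To make the arithmetic concrete, here is a small sketch of that memory estimate. It assumes the Adam moments are also kept in BF16 (2 bytes each), matching the ~48GB figure above; with FP32 moments the total would be even larger. Activations are excluded.

```python
# Rough memory budget for full BF16 fine-tuning of a 12B-parameter
# model with Adam, excluding activations. The 2-byte Adam moments are
# an assumption chosen to match the ~48GB figure quoted above.
params = 12e9

bytes_weights = params * 2      # BF16 weights: 2 bytes per parameter
bytes_grads = params * 2        # BF16 gradients: 2 bytes per parameter
bytes_optim = params * 2 * 2    # Adam: two BF16 moments per parameter

total_gb = (bytes_weights + bytes_grads + bytes_optim) / 1e9
print(f"weights:          {bytes_weights / 1e9:.0f} GB")
print(f"gradients:        {bytes_grads / 1e9:.0f} GB")
print(f"optimizer states: {bytes_optim / 1e9:.0f} GB")
print(f"total:            {total_gb:.0f} GB")  # 96 GB, over the 80 GB on one H100
```

Even before counting activations, the static state alone exceeds the 80GB of a single H100.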
You need to use techniques such as LoRA or QLoRA to train on a single H100 GPU.
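The reason LoRA helps is that it freezes the large pretrained weights and trains only a low-rank update, so gradients and optimizer states are needed for a tiny fraction of the parameters. A minimal NumPy sketch of the idea, with illustrative shapes (in practice you would use a library such as PEFT on the actual model):

```python
import numpy as np

# LoRA idea in miniature: keep the pretrained weight W frozen and learn
# a low-rank update B @ A. Only r * (d_in + d_out) parameters need
# gradients and optimizer states instead of d_in * d_out.
rng = np.random.default_rng(0)
d_in, d_out, r = 4096, 4096, 8          # illustrative layer sizes and rank

W = rng.standard_normal((d_out, d_in))  # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, init 0

def forward(x):
    # Frozen base path plus the low-rank adapter path.
    return W @ x + B @ (A @ x)

full_params = d_in * d_out
lora_params = r * (d_in + d_out)
print(f"full: {full_params:,}  lora: {lora_params:,}  "
      f"ratio: {full_params / lora_params:.0f}x")
```

Because B starts at zero, the adapted layer initially behaves exactly like the frozen one, and training only has to store optimizer state for A and B. QLoRA goes further by also quantizing the frozen weights to 4-bit.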