torch.OutOfMemoryError

#10
by JohnneyQin - opened

I used the provided sample program, but there seems to be a problem: it tries to allocate 700+ GiB of memory. I am not sure whether this is expected.

torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 727.61 GiB. GPU 0 has a total capacity of 31.75 GiB of which 22.86 GiB is free. Process 2853 has 8.88 GiB memory in use. Of the allocated memory 8.39 GiB is allocated by PyTorch, and 129.11 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

What worked for us was:

  1. In from_pretrained, on a dual 24 GB setup, pass device_map="balanced", max_memory={0:"24GiB",1:"24GiB"}.

  2. Then, as instructed, comment out pipe.enable_sequential_cpu_offload(), but don't add back a pipe.to("cuda"), since it defaults to that anyway and it got confused when the call was included.
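For anyone who wants to try the same thing, here is a rough sketch of the two steps. It assumes the sample program loads the model through diffusers' generic DiffusionPipeline class (the thread does not say which pipeline class is used); balanced_load_kwargs and load_pipeline are hypothetical helper names, and the model id is whatever the sample program already uses.

```python
def balanced_load_kwargs(num_gpus=2, per_gpu="24GiB"):
    """Build the from_pretrained keyword arguments from step 1.

    Hypothetical helper: spreads the pipeline across `num_gpus` GPUs and
    caps each one at `per_gpu` of memory.
    """
    return {
        "device_map": "balanced",
        "max_memory": {gpu: per_gpu for gpu in range(num_gpus)},
    }


def load_pipeline(model_id, **extra):
    """Load the pipeline across both GPUs (step 1) without offload (step 2).

    Assumption: DiffusionPipeline stands in for the pipeline class used by
    the sample program; swap in the real class if it differs.
    """
    # Imported here so the kwargs helper can be checked without diffusers.
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        model_id, **balanced_load_kwargs(), **extra
    )
    # Step 2: leave this commented out, and do not add pipe.to("cuda");
    # device_map has already placed the components on the GPUs.
    # pipe.enable_sequential_cpu_offload()
    return pipe


if __name__ == "__main__":
    # Shows the arguments only; no model is downloaded here.
    print(balanced_load_kwargs())
```

On a machine with a different number of cards or different memory per card, adjust num_gpus and per_gpu accordingly; the values above just mirror our dual 24 GB setup.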

Generated these:

Based on this source FLUX frame:
https://liaise.cloud/bob?node=BOB:4UsGAcaVUh8m9Eza52HL8NrgGi9uNzQchc2Nz38CwQzU:2003994:1734790309441:2kzml8nz6etew7ykuihtugvcm57wypuelbibeimik6gk

Prompt details:
https://liaise.cloud/get?node=POW:4UsGAcaVUh8m9Eza52HL8NrgGi9uNzQchc2Nz38CwQzU:482877346:1734790309345:2ly3forqsddqyqnyvkrdam1gkd1uh6bs64duqsisqjq3
