Huge memory usage
#3, opened by flymonk
This model uses more than 90GB of memory even when the conversation is not long.
Is that normal or some bug?
Not sure, I saw the same behavior on @awni's demo.
However, I managed to run a bitsandbytes 4-bit quantized version on an A5000 with 24GB of VRAM.
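For a rough sense of why quantization makes that much difference, here is a back-of-the-envelope sketch of weight memory at different precisions. The 40B parameter count is a made-up placeholder, not this model's actual size, and `weight_memory_gb` is a hypothetical helper. It counts weights only, so the KV cache, activations, and framework overhead come on top.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# ASSUMPTION: 40e9 parameters is purely illustrative, not the real model size.
HYPOTHETICAL_PARAMS = 40e9

for bits in (16, 8, 4):
    gb = weight_memory_gb(HYPOTHETICAL_PARAMS, bits)
    print(f"{bits}-bit weights: ~{gb:.1f} GB")
```

Under that assumption, going from 16-bit to 4-bit cuts weight memory by 4x, which is the kind of reduction that could bring a model from far beyond a 24GB card to within reach of one.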