Merging 70B models with LoRA

Opened by practical-dreamer

Hey Jon, thanks for all the effort you put into these finetunes.

Could I ask how you're merging Llama-2 with the adapter? I've tried several scripts from GitHub, but they seem to have problems with 70B models.

Thanks

I use this script:
https://github.com/jondurbin/qlora/blob/main/qmerge.py

I typically run it on an instance with at least 3x 80GB A100s; otherwise it tends to OOM or just silently fails to save all the weights.
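
For anyone who lands here later, this is a minimal sketch of the general merge pattern using peft's `merge_and_unload` -- not the qmerge.py script linked above, just the standard approach it builds on. The paths are placeholders; adjust them for your checkpoint.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_path = "meta-llama/Llama-2-70b-hf"  # placeholder base model
adapter_path = "path/to/lora-adapter"    # placeholder adapter dir
out_path = "merged-model"

# Load the base model; device_map="auto" spreads the 70B weights
# across all visible GPUs (spilling to CPU RAM if they don't fit).
base = AutoModelForCausalLM.from_pretrained(
    base_path,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the LoRA adapter, then fold its deltas into the base weights.
model = PeftModel.from_pretrained(base, adapter_path)
model = model.merge_and_unload()

# Save the merged checkpoint plus the tokenizer.
model.save_pretrained(out_path, safe_serialization=True)
AutoTokenizer.from_pretrained(base_path).save_pretrained(out_path)
```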

Thank you! You're a rockstar. I've been trying this on Azure instances... the merge is done in system RAM, right? Or does it need a GPU?

It ends up using the GPUs if available because of device_map="auto", and I usually have the GPU instances up from the fine-tunes anyway; I haven't tested on CPU.
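
In principle you can pin everything to CPU by overriding the device map -- untested here, as noted above, and you'd need enough system RAM for the full fp16 checkpoint (roughly 140 GB for 70B) plus headroom:

```python
import torch
from transformers import AutoModelForCausalLM

# Untested CPU-only variant: keep all weights in system RAM.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",  # placeholder base model
    torch_dtype=torch.float16,
    device_map={"": "cpu"},       # pin the whole model to CPU
)
```

Expect the merge to be much slower than on GPU, but it avoids needing multiple 80GB cards.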

practical-dreamer changed discussion status to closed
