Thank you! Got more details on the fine-tuning?
#1 opened by KnutJaegersberg
Thank you for making this great fine-tune! Can you help me understand the parameter settings you chose for fine-tuning your model?
I still struggle to find good ones. I see you chose different target modules, but what about the other hyperparameters?
Could you list them?
Here's the W&B run:
https://wandb.ai/jondurbin/bagel-jamba-v0.5/runs/h730jkg1/overview?nw=nwuserjondurbin
TL;DR:
- rank 16
- alpha 32
- learning rate 0.0001
- per device batch size 4
- gradient accumulation steps 4
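Since rank, alpha, and target modules are LoRA-style terminology, here is a minimal sketch of how these values would map onto the PEFT library's `LoraConfig` and the transformers `TrainingArguments`. This is an illustration, not the actual training script from the run; the `target_modules` and `output_dir` below are placeholders, and the real values are in the linked W&B config.

```python
# Sketch only: maps the listed hyperparameters onto PEFT + transformers.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                                 # LoRA rank
    lora_alpha=32,                        # LoRA scaling factor (alpha)
    target_modules=["q_proj", "v_proj"],  # placeholder; see the W&B run config
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="bagel-jamba-lora",        # hypothetical output path
    learning_rate=1e-4,                   # 0.0001
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,        # effective batch = 4 * 4 * num_gpus
)
```

Note that with per-device batch size 4 and 4 gradient accumulation steps, the effective batch size works out to 16 per device (times the number of GPUs used).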
Thanks so much!
KnutJaegersberg changed discussion status to closed